# Megaminx Scrambler



## hawkmp4 (Oct 30, 2009)

I read Dan Hayes' post on the speedsolving yahoo group (http://games.groups.yahoo.com/group/speedsolvingrubikscube/message/41005) a while ago, and now I have the programming ability to implement some of the ideas.
So, I threw together a quick scrambler that generates 125 move scrambles (not counting cube turns). It allows 1/5 and 2/5 turns in both directions. It rejects sequences like D Y D or Y D Y and duplicate turns, of course. Cube turns are randomly interspersed and as I said, are not counted in the move count. 
Now, the quality of the scrambles is phenomenal. No problems whatsoever. The problem lies in...well...
Here's a sample output...
D+ R-- D-- Y++ R+ Y+ R++ D- R++ Y+ R++ D++ Y+ R- Y-- R++ D+ Y++ R- D-- R-- D++ R-- D+ Y- R- D- R- Y++ D- R+ Y-- D- R++ D+ Y++ R-- D-- R++ Y+ R-- D-- R++ Y-- D- R+ Y-- D- R- D+ Y+ R++ Y- R- D-- Y++ R+ D-- Y- R++ D+ Y++ R-- Y-- D+ R++ D+ R- D+ Y+ R+ D-- R+ D-- R+ Y++ D++ R- Y++ D++ R++ D-- R+ D++ R-- Y++ R- D-- Y+ R+ Y-- R-- Y+ D++ R-- D++ R+ D++ R++ D- Y- R+ Y- D-- R-- D++ Y- R- D++ Y- R++ Y+ D-- R+ Y- R-- D- R++ Y++ R++ D-- Y++ R- D- Y+ R+ D++ Y++ R-- Y-- D+ R+ Y+ R-- D- Y+ R+ D++ R++ Y+ D+ R++ D++ R+ D++ Y- R-- Y- D-- R++ Y++ R++ Y-- D- R- Y- D+ R++ Y++ D+ R- Y++ R- Y+ D++ R- D+ Y-- R- Y++ R- Y-- R++ Y+ R- Y-- D-- R-- 
Frankly, it takes forever. It takes me longer to scramble than solve.
Does anyone have any ideas for improving this, without compromising the scramble quality?


----------



## Kickflip1993 (Oct 30, 2009)

as far as i know, normal scrambles are only 60 turns and have a U or U' as every 10th turn


----------



## Jake Gouldon (Oct 30, 2009)

We need to find a way to get better megaminx scrambles, and I think the non-Pochman system did that. There are many more possible cube states using it.

As to your scrambler, perhaps a lower move-count? I would say 100 is plenty.


----------



## hawkmp4 (Oct 30, 2009)

Neither of you have seemed to read the link I supplied.


----------



## qqwref (Oct 30, 2009)

The statistics in the link you posted are pretty weird. I'm not really sure to make of something that says we need >50 moves to get an adequate 3x3 scramble... it's just so far away from what we've observed as a community that I would definitely have to do the analysis myself to be sure.

Anyway, if you want high-quality megaminx scrambles without doing a huge number of moves, I think the best solution is to go back to 3x3-style scrambles... give each face a name and just randomly do 60 or 70 moves, like the 3x3 scrambler. It's just not practical to have to read and do 120 or so moves for something that's going to take only a minute for a fast competitor to solve.


----------



## Lucas Garron (Oct 30, 2009)

hawkmp4 said:


> It rejects sequences like D Y D or Y D Y and duplicate turns, of course.


Why?



hawkmp4 said:


> Now, the quality of the scrambles is phenomenal. No problems whatsoever.


How does the phenomenality compare to current official scrambles?
...if we use the same scramble lengths? (Count rotations, because those take time to do.)


----------



## hawkmp4 (Oct 30, 2009)

Lucas Garron said:


> hawkmp4 said:
> 
> 
> > It rejects sequences like D Y D or Y D Y and duplicate turns, of course.
> ...



D Y D can be rewritten as some combination of a single D move and a single Y move. So I didn't think it was necessary. For example, D++ Y+ D- is more effectively written as D+ Y+, which the scrambler is perfectly capable of generating.

I haven't done enough scrambles to really see empirically the difference, I'll be honest. But I've had rather poor results with the Pochmann scrambles. Way more intact CE pairs than I think there should be with 70 turns, which Dan's statistics seem to support. I haven't seen anything like that with the scrambler I wrote. Yet, at least. I could be wrong.

qq- Rather than rejecting the statistics because the result seems weird, can you explain what's wrong with them? Dan Hayes lays out pretty clearly the tests he did. Apart from arithmetic error, I don't see anything that could be a problem. It's not what we'd expect, no. That doesn't mean it should be discounted without examination. If you do see something though, please let me know. I'm curious.

In regards to the old style of scrambling- unfortunately I think that is the best way to go. Slow, but effective.


----------



## qqwref (Oct 30, 2009)

hawkmp4 said:


> qq- Rather than rejecting the statistics because the result seems weird, can you explain what's wrong with them? Dan Hayes lays out pretty clearly the tests he did. Apart from arithmetic error, I don't see anything that could be a problem. It's not what we'd expect, no. That doesn't mean it should be discounted without examination. If you do see something though, please let me know. I'm curious.



Sure, he says what tests he did, but I doubt the results. No calculations/code are shown so I don't even know if he calculated anything correctly; arithmetic error is a huge fact. For instance he lists expected SDs of 360 and 279 without any explanation. Don't accuse me of rejecting things "without examination"... did you even notice the post was in 2008? There is no backing evidence or data, or macros we can run ourselves, just a bunch of numbers someone posted. I'm not rejecting the results automatically, but if his analysis gives numbers that are much different from what we expect, I personally think it is more likely that he did something numerically wrong than that we need 50 moves to get good 3x3 scrambles. It could be either way, but we would have to see the code to decide.


----------



## hawkmp4 (Oct 31, 2009)

qqwref said:


> hawkmp4 said:
> 
> 
> > qq- Rather than rejecting the statistics because the result seems weird, can you explain what's wrong with them? Dan Hayes lays out pretty clearly the tests he did. Apart from arithmetic error, I don't see anything that could be a problem. It's not what we'd expect, no. That doesn't mean it should be discounted without examination. If you do see something though, please let me know. I'm curious.
> ...


I'm sorry- I didn't mean that to be offensive in any way. I have the utmost respect for you. I just meant that you provided no more justification for your dismissal than he did for the numbers. I've sent Dan an email for the programs. I guess we can't do much more until we can go over them, can we? 

Also, I wasn't around during the change, why was the previous method abandoned? Ease of use for non-Megaminxers, for lack of a better word?


----------



## qqwref (Oct 31, 2009)

hawkmp4 said:


> Also, I wasn't around during the change, why was the previous method abandoned? Ease of use for non-Megaminxers, for lack of a better word?



Pretty much. As soon as Stefan introduced the new scrambler everyone loved it - it IS far easier to do (a lot of people commented that they pretty much never made a mistake with it, whereas the original scrambler was easy to make errors with) and the scrambles it produces look scrambled enough (ha). As far as I can tell nobody seems to mind that it only generates 10^21 out of the 10^68 possible positions.

I would like to see some kind of analysis of, on average after X random moves, how close edges and corners are to being paired up. What I mean is, you take every adjacent corner and edge from the solved state, and see how many moves it would take to pair them up again from the scramble. A value for this that is significantly lower than the random scramble average would signify that however many moves is not enough. I think this is a better metric than many others (since it's obviously related to how easy the scrambles are to actually solve) but I've never seen it computed.


----------



## hawkmp4 (Oct 31, 2009)

That does bother me quite a lot. That's a significant amount of states we just can't generate with this scrambler. It IS a lot easier, but it makes me squirm. That's why I've been working on a better scrambler, if only for my own use. 
That would be a very interesting analysis- but well over my head. I'd have no idea how to approach that.


----------



## qqwref (Oct 31, 2009)

OK, I did a basic analysis looking at the # of stickers of the center color on each face for 3x3. I used Jaap's old scrambler and did 100k scrambles each (so 600k faces) and used a version of the chi-square test to check if the distributions of the number of same-color stickers were the same for X turn scrambles as for 1000 turn scrambles.

My results somewhat correlated with the results you linked to, which I found a bit surprising. The test said that the 40, 50, and 100 turn scrambles were probably not from different distributions as the 1000-turn scrambles, but the 30-turn scramble failed the test (p = 0.00993 that they were from the same distribution) and the 25-turn scramble failed it really hard (p ~= 10^-15). So as far as this variable is concerned 25 turns is really not enough, and we might even need a bit more than 30 to get something that doesn't look not random.


----------



## spdqbr (Oct 31, 2009)

There has been some analysis like this done here. I am trying to dig up the code I used to generate those numbers, I should still have it somewhere.


----------



## Stefan (Oct 31, 2009)

hawkmp4 said:


> Now, the quality of the scrambles is phenomenal.


"Phenomenal"? Are you doing mathematics or marketing? Numbers, please.



hawkmp4 said:


> why was the previous method abandoned? Ease of use *for non-Megaminxers*, for lack of a better word?


Change that to "for non-Megaminx-official-scramblers". Cause I consider myself a decent Megaminxer and I had trouble doing the 60 moves in five minutes. With the new method I can do the 70 moves in 50 seconds easily (and others do it much faster). In other words, I could do 420 moves with the new method in the same time the old method takes me.

And I believe Daniel did make some mistake:


> Megaminx:
> WCA Scrambler
> Turns | Std Dev | Expected Std Dev | Confidence
> 10 |108483.94| 279 | 0.3%
> ...


Note how starting at 300 moves the measured standard deviation drops further and further below the expected standard deviation, and the confidence shrinks accordingly. I'm no statistician, either, but that seems rather odd to me. I think it should just get closer and closer to 279 from above. Daniel, if you can still run the program to get those numbers, can you look how it develops with even more moves (2000, 3000, etc)?


----------



## hawkmp4 (Oct 31, 2009)

StefanPochmann said:


> hawkmp4 said:
> 
> 
> > Now, the quality of the scrambles is phenomenal.
> ...


Fair enough, I'll cut the rhetoric.

Okay. I wasn't sure. I've never used the old scrambler so it really was an honest question. The question that lurks in my mind though is whether those 420 moves are really as effective as they could be.

That also bothered me- which is why I wanted to see if the results were reproduced. I have confidence in Daniel's work, absolutely, but that is an odd phenomena that no one ever came up with an explanation for.


----------



## Stefan (Oct 31, 2009)

hawkmp4 said:


> Fair enough, I'll cut the rhetoric.


Well, if you have a reason to call it phenomenal, then please, by all means, call it that. Just let us know the reason as well.



Jake Gouldon said:


> We need to find a way to get better megaminx scrambles, and I think the non-Pochman system did that. *There are many more possible cube states using it.*


So the low number of possible outcomes is your complaint? How is that bad? You think 10^21 scrambles isn't enough for all of us for a long time?


----------



## qqwref (Oct 31, 2009)

StefanPochmann said:


> So the low number of possible outcomes is your complaint? How is that bad? You think 10^21 scrambles isn't enough for all of us for a long time?



If nothing else, it's an indication that there is almost certainly some kind of bias. The set of positions that the scrambler can generate is a minuscule subset of the set of possible positions, and since the scrambler positions were not chosen as random samples from the larger set (but simply by looking at every position that can be reached in 70 pochmann-scrambler moves) it is very likely that the positions that the scrambler can reach are significantly different (on average) than the positions that you would get from just randomly choosing one minx position. There's no way to know if the positions the scrambler generates are easier/harder/closer to solved/less blocky/whatever compared to the positions a uniform random distribution would generate, but the fact that the number of positions the scrambler generates is so small means that there is almost certainly a significant difference. We just don't know what the difference is.

I think in the cube community there is a bit of a problem with people assuming scramblers 'work' (i.e. provide a distribution of solves that closely resembles the one we would get by randomly picking a cube state) when they aren't proven to. For some scramblers, such as optimal random state scramblers, it is trivial to prove this is true; for others, I think we tend to just assume that if positions look scrambled they are, without really doing an analysis. For example the Square-1 scrambler programmed by Jaap, which is currently in use for WCA scrambles, seems to provide parity only ~40% of the time in practice when the puzzle is solved with optimal cubeshape. Your megaminx scrambler might provide positions that look scrambled, but this is not enough to show the scramble is as hard on average as it should be - try doing 40 random moves on a 7x7. Personally I think the fact that the megaminx scrambler generates such a small fraction of the positions is evidence against its having a distribution similar to a theoretical optimal random scrambler, and I think we really cannot be sure whether it provides positions that are truly as difficult as random positions or whether it just provides positions that appear scrambled but might not be.


----------



## Mike Hughey (Oct 31, 2009)

One thing that I couldn't help noticing was that the week we started using Stefan's scrambles for megaminx on the weekly competitions, my times suddenly seemed to get a lot better. Of course, there are lots of possible explanations for this: I simply got better that week; not having to apply the old scrambles meant I was less fatigued so I was better at actually solving; psychologically I felt they were better so I went faster; just coincidence due to insufficient data points; etc. But it is something I noticed, FWVLIW (for what very little it's worth).


----------



## hawkmp4 (Nov 1, 2009)

Mike Hughey said:


> One thing that I couldn't help noticing was that the week we started using Stefan's scrambles for megaminx on the weekly competitions, my times suddenly seemed to get a lot better. Of course, there are lots of possible explanations for this: I simply got better that week; not having to apply the old scrambles meant I was less fatigued so I was better at actually solving; psychologically I felt they were better so I went faster; just coincidence due to insufficient data points; etc. But it is something I noticed, FWVLIW (for what very little it's worth).


That's how I've felt using my scrambler. It's not as easy to find the right C/E pair, and my 'cross' edges aren't nearly always in the bottom half of the minx. It just feels to be more random. I'd never known the numbers of states possible by the Pochmann scrambler, but as qqwref said, I doubt that they're a representative sample of all the scrambles.


----------



## Stefan (Nov 2, 2009)

hawkmp4 said:


> That's how I've felt using my scrambler. It's not as easy to find the right C/E pair, and my 'cross' edges aren't *nearly always in the bottom half of the minx.*


Again: Numbers, please.

This did sound interesting so I did a small test with one WCA scramble:
R++ D++ R++ D++ R++ D++ R-- D-- R-- D++ U
R-- D-- R-- D-- R++ D-- R-- D++ R++ D++ U
R-- D-- R++ D++ R++ D-- R++ D++ R++ D-- U'
R-- D-- R-- D++ R++ D-- R-- D-- R++ D-- U'
R++ D++ R++ D-- R-- D-- R++ D-- R++ D++ U
R++ D-- R-- D-- R-- D-- R++ D++ R-- D-- U'
R-- D++ R-- D-- R-- D++ R-- D++ R-- D++ U

I counted where the edges of each face ended up. On/adjacent to their goal layer (10 positions), in the middle zig-zag (10 positions), or on/adjacent to the layer opposite their goal layer (10 positions). I did it for all sides cause I don't know how you hold your minx before scrambling, and to get more data. The results:

white: 1/2/2
brown: 3/2/0
blue: 1/2/2
purple: 2/1/2
green: 0/3/2
turquoise: 2/2/1
yellow: 2/3/0
red: 1/2/2
orange: 0/1/4
green2: 0/2/3
pink: 0/4/1
blue2: 0/2/3

Sum: 12/26/22

Assuming you meant bottom=whereTheCrossIs, here I experienced the exact opposite of what you describe. But please, share your numbers with us that you had but forgot to post along with your claim.

Of course on average the result should be 20/20/20 since all three areas are the same size, so my above result deviates quite a bit. If someone's able to test this on a larger scale with a program, I'd be very interested in the results.


----------



## CharlieCooper (Nov 2, 2009)

StefanPochmann said:


> hawkmp4 said:
> 
> 
> > Now, the quality of the scrambles is phenomenal.
> ...



Hilarious.


----------



## hawkmp4 (Nov 2, 2009)

hawkmp4 said:


> Mike Hughey said:
> 
> 
> > One thing that I couldn't help noticing was that the week we started using Stefan's scrambles for megaminx on the weekly competitions, my times suddenly seemed to get a lot better. Of course, there are lots of possible explanations for this: I simply got better that week; not having to apply the old scrambles meant I was less fatigued so I was better at actually solving; psychologically I felt they were better so I went faster; just coincidence due to insufficient data points; etc. But it is something I noticed, FWVLIW (for what very little it's worth).
> ...


I said I felt that way because I didn't have any numbers to support it.
Might I add, though, that one scramble could hardly be called sound evidence, either.


----------



## Stefan (Nov 2, 2009)

hawkmp4 said:


> one scramble could hardly be called sound evidence, either.


Of course. Though I do have data for twelve sides and and the near area won against the far area only three times and lost eight times, and I think the sides are reasonably unrelated to count this as more than one result.

Btw, does anyone know how many outcomes our Cube Explorer scrambling for 3x3x3 can produce?


----------



## hawkmp4 (Nov 2, 2009)

StefanPochmann said:


> Btw, does anyone know how many outcomes our Cube Explorer scrambling for 3x3x3 can produce?



I was under the impression that Cube Explorer picked a random state and solved it. So all of the legal states of the cube.


----------



## Stefan (Nov 2, 2009)

hawkmp4 said:


> I was under the impression that Cube Explorer picked a random state and solved it. So all of the legal states of the cube.


I doubt it. That would require a pretty darn good randomness generator.


----------



## qqwref (Nov 2, 2009)

I don't want to get into an argument about how good the various pseudo-RNGs we use are. That argument is really not fun, and there is nothing we can do to affect it anyway. The best we can try to do as cubers is to create an algorithm which is as close to random as possible _given an ideal RNG_, because then if the statistics turn out wrong the error will at least not be on our end.

Given a perfect RNG, though, picking a random cube position is really easy. I'm not sure what you're trying to say Stefan :|


----------



## Stefan (Nov 2, 2009)

qqwref said:


> Given a perfect RNG, though, picking a random cube position is really easy. I'm not sure what you're trying to say Stefan :|


Just curious, I don't remember having this answered in the previous discussions. I don't know what RNG Cube Explorer uses, but probably not a perfect one. And if it uses let's say a 32-bit one then the number of possible outcomes is small.


----------



## Stefan (Nov 2, 2009)

Another thought: I don't know Javascript's PRNG and it might differ between implementations, but let's say it uses some 64-bit one so the current WCA megaminx scrambler can't even generate 2^70 but at most 2^64 different scrambles. This would limit all other scrambling methods the same way, including the previous WCA one and hawk's and the current WCA one cranked up to 1000 moves and even random state generators. So unless that issue is ruled out, blaiming a scrambler solely for small number of possible outcomes is flawed even as "indication" like qq explained.


----------



## qqwref (Nov 2, 2009)

StefanPochmann said:


> let's say it uses some 64-bit one so the current WCA megaminx scrambler can't even generate 2^70 but at most 2^64 different scrambles.



So what you're saying is, if it's deterministic and based only on one of its generated numbers, there would only be 2^64 possible 70-bit sequences and thus only 2^64 possible scrambles? That's true, and I don't know how many 'states' Javascript's random number generator has, but I hope it's larger than that. (Still, this problem is due to the programming language and not the scrambling algorithm itself - if the scrambling algorithm has a severe limitation on the number of scrambles it can generate, it's still in some sense faulty.)

This might be a bit of an overkill idea, but perhaps in the future we could generate official scrambles using data from random.org (which is literally random as opposed to pseudorandom). One section of the website will generate up to 10000 strings of length up to 20, so you could generate a bunch of those and then algorithmically transform them into scrambles. It's an interesting thought.


----------



## Stefan (Nov 2, 2009)

qqwref said:


> I don't know how many 'states' Javascript's random number generator has, but I hope it's larger than that.


Well, if I understand correctly, *Java* even only uses 48 bits:
http://java.sun.com/javase/7/docs/api/java/util/Random.html



qqwref said:


> Still, this problem is due to the programming language and not the scrambling algorithm itself - *if the scrambling algorithm has a severe limitation on the number of scrambles it can generate, it's still in some sense faulty.*


Please judge this one:

Pick 10^50 states perfectly randomly and hardcode scrambles of optimal length for them into the program. When asked for a scramble, the program perfectly randomly picks one of these scrambles.

Wouldn't you agree that this would result in short and high quality scrambles and would thus be very very good rather than "faulty", despite the severe limitation on the number of scrambles it can generate?


----------



## hawkmp4 (Nov 2, 2009)

Honest question:
Though I see the issue if we were picking a random state of the Megaminx, with scrambles the way they are currently aren't we picking only from the available sides and how far to turn them? And doing this repeatedly? Many many less than 2^32 possibilities. So how is this applicable to the Megaminx scrambler?
Now I never thought of that applied to 3x3, and that's very...bothersome to say the least. That's a very small subset that we can generate.


----------



## Stefan (Nov 2, 2009)

hawkmp4 said:


> Honest question:
> Though I see the issue if we were picking a random state of the Megaminx, with scrambles the way they are currently aren't we picking only from the available sides and how far to turn them? And doing this repeatedly? Many many less than 2^32 possibilities. So how is this applicable to the Megaminx scrambler?


Are you saying that 2^70 is less than 2^32?

I'm also confused about _"picking only from the available sides"_ as that doesn't sound like the current WCA megaminx scrambler which I think you're talking about. That one doesn't pick sides and only picks direction, not how far to turn.


----------



## hawkmp4 (Nov 2, 2009)

I'm saying any of the available megaminx scramblers. The current one, for example, only picks from -- or ++. The old one (I'm not sure the exact mechanics) would pick from at most 12 sides * 4 ways to move each side = 48 possibilities. 2^32 is much greater than that. So how does the number of bits of the PRNG restrict the number of scrambles we can possibly output?


----------



## qqwref (Nov 2, 2009)

If the PRNG only looks at the last thing it output to get the next number (and not all do), then the number of *sequences* of any length is the same as the number of bits it is calculating with. This is because every number output will give you the next number if you just know how to get it. So whether you are looking at one number or generating a bunch of numbers and taking them all mod 2 or 3, there are the same number of possible things that could happen.


----------



## hawkmp4 (Nov 2, 2009)

Okay. That makes sense. Thank you.
...now couldn't that be avoided just by reseeding the PRNG as needed?


----------



## Stefan (Nov 2, 2009)

qqwref said:


> If the PRNG only looks at the last thing it output to get the next number (*and not all do*)


Indeed. The Java one for example uses its internal 48-bit state rather than the previous 32-bit output to determine the next. I admit though that this seems to be a requirement from portability and that Javascript may be much better thanks to only specifying to return a number between 0 and 1.



hawkmp4 said:


> ...now couldn't that be avoided just by reseeding the PRNG as needed?


How often, and with what?


----------



## hawkmp4 (Nov 2, 2009)

After it's 32 bit string has been used, and possibly with the system clock?


----------



## Stefan (Nov 2, 2009)

hawkmp4 said:


> After it's 32 bit string has been used, and possibly with the system clock?


Then you're abusing the PRNG as a simple hash function with bad input.

Oh and at least Mozilla's Javascript engine uses the same 48-bit PRNG as Java:
http://mxr.mozilla.org/mozilla/source/js/src/jsmath.c

So getting your WCA megaminx scrambles via Firefox means limiting yourself to 2^48 or fewer and it has nothing to do with the actual scrambling method.


----------



## arckuss123 (Nov 2, 2009)

cool


----------



## hawkmp4 (Nov 2, 2009)

Call it what you may, but doesn't that provide us with the end result we want?


----------



## Stefan (Nov 2, 2009)

hawkmp4 said:


> Call it what you may, but doesn't that provide us with the end result we want?


Well, while you want it, I find it irrelevant. You really don't need more than 2^32 scrambles. You get no practical advantage from this at all, it just makes things more complicated and gives you a false sense of safety (from the terrifying unrandomness). And you might generate duplicate scrambles if you're fast enough and not careful (side effect of making things complicated is risk of doing something wrong).

What's important is the quality of the scrambling, not the number of possible outcomes (if the latter is reasonable (so don't say I said a megaminx scrambler with ten possible outcomes can be good)).


----------



## Stefan (Nov 2, 2009)

arckuss123 said:


> cool


All 38 posts before yours?


----------



## qqwref (Nov 2, 2009)

I said this wasn't a fun argument to get into :|


----------



## Stefan (Nov 2, 2009)

hawkmp4 said:


> [Reseed] after it's 32 bit string has been used, and possibly *with the system clock*?


Actually... reseeding with the system clock might be *worse* than simply trusting the PRNG's internal state of for example 48 bits to do its job. Cause even if the system clock updated 100 times a second and you lived 100 years, you'd only be able to get about 2^38 different system clock values and thus seeds.


----------



## spdqbr (Nov 2, 2009)

Well, I've scanned all the hard drives I own which I can get spinning, and I have found some, but not all of the code used to calculate the statistics in that post. What I have here is the driving class for testing the different scramble generators. Unfortunately I'm missing a few important classes such as Scrambler, Cube3, and Megaminx. Unfortunately these are where the scrambles are actually applied to the puzzles and those are very tedious to write. Also the code is neither pretty nor streamlined nor commented, so tread carefully.

I will work on regenerating the missing classes and when I get them going / find them, I'll repost them here. But at least with this piece, anyone who wants to can verify that IF my scramble generators and scrambler appliers where coded correctly, then the logic for calculating the statistics is what I claimed it was (whether or not that is the correct way to test scramble validity is an entirely different matter).

A few notes on the code:
It ignores centers on both megaminx and 3x3
The meaning of the megaminx scramblers names' are (as best I can remember):
MEGAMINXDTSR = Double Turn Single Rotation similar to the WCA scrambler in only chooseing ++ -- and ' (no - + ")
MEGAMINXDTSR = Double Turn Variable Rotation (allows ++ -- ' ")
MEAGMINXST = Single Rotation (allows + ++ - -- and ')
There were others available but it would seam that these are the last ones tested.



Spoiler





```
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.util.Arrays;
import java.util.StringTokenizer;

public class CubeStats extends MathUtils{
	
	public static void main(String[] args) {
		
		long start = System.currentTimeMillis();
		File out = new File("/home/spdqbr/Desktop/stats.txt");
		try{
		BufferedWriter bw = new BufferedWriter(new FileWriter(out));
		/*
		Scrambler scrambler = new Scrambler(Scrambler.CUBE3)
		
		Cube3 puzzle = new Cube3();
		
		
		int sides = 6;
		int stickersPerFace = 8;
		//int[] scrambleLengths = {200,500,1000};
		int[] mins = {1,10,20,30,35,40,50,1,10,20,30,40};
		int[] maxs = {10,20,30,40,45,50,60,20,30,40,50,60};
		int trials = 1000000;
		int[] totals;
		Integer[] scrambled = null;
		Integer[] scrambledNoCenters = null;
		String scramble;
		for(int z = 0; z<mins.length; z++){
			totals = new int[stickersPerFace*sides*sides];
		for(int i = 0; i<trials; i++){
			puzzle.reset();
			scramble = scrambler.generateScramble(mins[z],maxs[z]);
			//scramble = Megaminx.WCAScrambleString(scrambleLengths[z]);
			puzzle.applyScramble(scramble);
			//puzzle.orient();
			scrambled = puzzle.getState();
			//scrambledNoCenters = stripCenters(scrambled);
			for(int j = 0; j<scrambled.length; j++){
				totals[j*sides+scrambled[j]] ++;
			}
		}*/
		
		
		Scrambler[] scramblers = {new Scrambler(Scrambler.MEGAMINXDTSR),
		new Scrambler(Scrambler.MEGAMINXDTVR), new Scrambler(Scrambler.MEGAMINXSR)};
		int[][] lengths = {new int[] {500,1000,2000},new int[] {175,180,210,220,250,500,1000,2000},
				new int[] {10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 125, 150, 175, 200, 500, 1000,2000}};
		
		Megaminx puzzle = new Megaminx();
		
		for(int si = 0; si < scramblers.length; si++){
		Scrambler scrambler = scramblers[si];
		bw.write("New scrambler "+scrambler.name+"\n");
		bw.write("Sample scramble: "+scrambler.generateScramble()+"\n");
		
		int sides = 12;
		int stickersPerFace = 10;
		int[] scrambleLengths = lengths[si];
		//int[] scrambleLengths = {26,27};
		int trials = 1000000;
		int[] totals;
		Integer[] scrambled = null;
		Integer[] scrambledNoCenters = null;
		String scramble;
		for(int z = 0; z<scrambleLengths.length; z++){
			totals = new int[stickersPerFace*sides*sides];
		for(int i = 0; i<trials; i++){
			puzzle.reset();
			scramble = scrambler.generateScramble(scrambleLengths[z]);
			//scramble = Megaminx.WCAScrambleString(scrambleLengths[z]);
			puzzle.applyScramble(scramble);
			puzzle.orient();
			scrambled = puzzle.getState();
			scrambledNoCenters = stripCenters(scrambled);
			for(int j = 0; j<scrambledNoCenters.length; j++){
				totals[j*sides+scrambledNoCenters[j]] ++;
			}
			
		}

		System.out.println();
		double stddev = stdDev(totals);
		double mean = mean(totals);
		
		bw.write(scrambler.name+"\n");
		System.out.println("mean        = "+mean);
		bw.write("mean        = "+mean+"\n");
		System.out.println("Scramble Length = "+scrambleLengths[z]);
		bw.write("Scramble Length = "+scrambleLengths[z]+"\n");
		System.out.println("StdDev      = "+stddev);
		bw.write("StdDev      = "+stddev+"\n");
		System.out.println("Sd% of mean = "+stddev/mean*100+"\n");
		bw.write("Sd% of mean = "+stddev/mean*100+"\n\n");
		bw.flush();
		}
		}
		bw.flush();
		bw.close();
		}catch(Exception e){
			//absorb
		}
		System.out.println(Test.longToTime(System.currentTimeMillis()-start));
		
	}
	
	public static String substitute(String in){
		StringBuffer out = new StringBuffer();
		StringTokenizer st = new StringTokenizer(in, " ");
		String[] subs = {"Wh", "DG", "Pu", "Or", "Br", "Te", "LG", "Re", "DB", "LB", "Pi", "Ye"};
		while(st.hasMoreTokens()){
			String temp = st.nextToken();
			if(temp.substring(0,1).equals("\n")){
				temp = temp.substring(1,temp.length());
				out.append("\n");
			}
			int num = 0;
			if(temp.length() == 2){
				num = Integer.parseInt(temp.substring(0,1));
			}else if(temp.length() == 3){
				num = Integer.parseInt(temp.substring(0,2));
			}
			out.append(subs[num]+", ");
		}
		return out.toString();
		
	}
	
	public static Integer[] stripCenters(Integer[] in){
		int[] centers = {10, 27, 30, 33, 36, 39, 92, 95, 98, 101, 104, 131};
		Integer[] out = new Integer[in.length-12];
		int offset = 0;
		for(int i = 0; i<in.length; i++){
			if(Arrays.binarySearch(centers, i)>=0)
				offset++;
			else
				out[i-offset] = in[i];
		}
		return out;
	}
}
```


----------



## Stefan (Nov 2, 2009)

Another thought: Pick six megaminx states perfectly randomly and generate scrambles for them. Now call these six scrambles "R++", "R--", "D++", "D--", "U" and "U'". And use them in the WCA scrambler. Still think that 2^70 scrambles means it's bad?

And I just finished an analysis of where the centers end up with what probability:

```
DLB   DLF   DRB   URF   ULF   B     F     U     URB   DRF   ULB   D       min   max      sum
     pink 8.329 8.329 8.326 8.338 8.341 8.327 8.340 8.347 8.338 8.327 8.340 8.320   8.320-8.347 => 100.000
    blue2 8.329 8.329 8.326 8.338 8.341 8.327 8.340 8.347 8.338 8.327 8.340 8.320   8.320-8.347 => 100.000
   orange 8.347 8.347 8.329 8.320 8.338 8.336 8.331 8.321 8.320 8.336 8.331 8.346   8.320-8.347 => 100.000
   purple 8.338 8.338 8.341 8.329 8.326 8.340 8.327 8.320 8.329 8.340 8.327 8.347   8.320-8.347 => 100.000
turquoise 8.320 8.320 8.338 8.347 8.329 8.331 8.336 8.346 8.347 8.331 8.336 8.321   8.320-8.347 => 100.000
   green2 8.340 8.340 8.328 8.327 8.339 8.332 8.334 8.331 8.327 8.332 8.334 8.336   8.327-8.340 => 100.000
    green 8.327 8.327 8.339 8.340 8.328 8.334 8.332 8.336 8.340 8.334 8.332 8.331   8.327-8.340 => 100.000
    white 8.326 8.326 8.347 8.341 8.320 8.339 8.328 8.329 8.341 8.339 8.328 8.338   8.320-8.347 => 100.000
     blue 8.338 8.338 8.341 8.329 8.326 8.340 8.327 8.320 8.329 8.340 8.327 8.347   8.320-8.347 => 100.000
      red 8.340 8.340 8.328 8.327 8.339 8.332 8.334 8.331 8.327 8.332 8.334 8.336   8.327-8.340 => 100.000
    brown 8.327 8.327 8.339 8.340 8.328 8.334 8.332 8.336 8.340 8.334 8.332 8.331   8.327-8.340 => 100.000
   yellow 8.341 8.341 8.320 8.326 8.347 8.328 8.339 8.338 8.326 8.328 8.339 8.329   8.320-8.347 => 100.000
```

See also:
http://www.stefan-pochmann.info/spocc/other_stuff/tools/scramble_megaminx/


----------



## qqwref (Nov 2, 2009)

Sorry, but I don't see why anyone would care where the centers end up, since no piece is fixed in the scramble. The centers really have nothing to do with the scramble quality. Besides, this is such a small part of the puzzle (and one which can be solved with about 3 moves) that of course it will look completely random after 70 moves. Looking at where the green-white-red corner ends up on a 70-move 100x100 scramble would also be indistinguishable from random, but that doesn't mean the scramble is good. (And really, you can never prove a non-purely-random scrambler is perfect, only that it is flawed or that you have not yet noticed any flaw.)

In addition your silly arguments do not prove that having 2^70 possible combinations is not a problem. It's the algorithm I am concerned with, not the output. Your examples also centered around the idea of using randomness to create the initial list, and thus were strictly better (i.e. closer to random) than the original scrambling algorithm, which relies on 70 single moves. The 2^70 limit does not prove that there is no macroscopic randomness, but just suggests it, because we have chosen some tiny subset of the sample space and then assumed it is representative of the entire thing, without any reason to believe it is. It's not like we randomly chose 2^70 positions - they are determined algorithmically to be all positions which are 70 Pochmann-scrambler moves away, and we really know nothing about this set.


----------



## Stefan (Nov 2, 2009)

qqwref said:


> Sorry, but I don't see why anyone would care where the centers end up


I care because I was curious about it by itself and because I can now extend my program to analyze more stuff. Centers just seemed the easiest thing to implement, so I started with that. And if others implement their own independent analyzer and are able to compute the same thing, we can compare our results to find mistakes or be more confident that we didn't make any. Also, if I had found a significant deviation or pattern for centers already then I would've worried about the scrambling method at this point already. You say _"of course it will look completely random"_, but I wasn't so sure and wanted to get us actual numbers. But I'm sorry if me posting these results bothered you.

Hawk said he felt his cross edges feel too easy to solve so I intend to get us some actual numbers for that, too. And you said you'd like to see an analysis of how close edges and corners are to being paired up, and I intend to get us that as well.



qqwref said:


> Besides, this is such a small part of the puzzle (and one which can be solved with about 3 moves) that *of course it will look completely random after 70 moves*. Looking at where the green-white-red corner ends up on a 70-move 100x100 scramble would also be *indistinguishable from random*


Wrong. To be indistinguishable from random, all my values would have to be 8.333% (exactly 1/12). Clearly this is not the case.



qqwref said:


> In addition your silly arguments do not prove that having 2^70 possible combinations is not a problem.


1. I don't understand why they're silly. Can you explain?
2. Having 2^70 possible combinations is not a problem.
3. I would never claim they prove it. But I think they help illustrating it better. Besides, their main purpose is to rebut the opposite idea, that a small number is a problem.



qqwref said:


> Your examples also centered around the idea of using randomness to create the initial list, and thus were strictly better (i.e. closer to random) than the original scrambling algorithm, which relies on 70 single moves.


Um, no, I'd say they're better because they offer more change than simple single turns. Note that the six scrambles are hardcoded into the supposed program once and for all time and for everyone. There are about 10^408 possible such programs and yes my supposed creation would use randomness to pick one, but each one of these programs has no more randomness than the current WCA scrambler.



qqwref said:


> The 2^70 limit does not prove that there is no macroscopic randomness, but just *suggests it*, because we have chosen some tiny subset of the sample space and then assumed it is representative of the entire thing, without any reason to believe it is. It's not like we randomly chose 2^70 positions - they are determined algorithmically to be all positions which are 70 Pochmann-scrambler moves away, and *we really know nothing about this set*.


Wait. I'm confused. Do we know nothing about it or do we know that it suggests lack of randomness? What is it?


----------



## Herbert Kociemba (Nov 2, 2009)

Motivated by the discussion here I took a closer look at the Assembler Code of the Random function Random(N) of the Cube Explorer in Delphi6, which computes a pseudorandom integer between 0 and N-1. RandSeed is a 32 bit integer variable and has 2^32 different values. On entry, the EAX register contains N.
I added some comments

XOR EBX, EBX 
; This sets EBX to 0

IMUL EDX,[EBX].RandSeed,08088405H
INC EDX 
MOV [EBX].RandSeed,EDX
; so the new RandSeed is (134775813*RandSeed + 1) (mod 2^32)
; At least if the multiplication would be unsigned. But because it is signed,
; this may be wrong - I do not know exactly. 

MUL EDX
;Unsigned multiplication of EDX and EAX, Result in EDX:EAX. After this 
;Operation EDX holds (RandSeed*N)/2^32, which is a number
;between 0 and N-1. This number is returned by the function.

The RandSeed sequence has the full period 2^32.
So the Random Generator gives 2^32=4294967296 different values.
To generate a Random cube I use Random(8!), Random(12!) Random(2^11) and Random(3^7) for random generation of the permutations and orientations of corners and edges.
If the cube has odd parity, the cube is discarded. This will happen for approximately 1/2 of the random cubes.

We will get "only" about 2^31 different valid cubes by the built in random generator. This is about 4.96505*10^-9 percent of the cube space....


----------



## qqwref (Nov 2, 2009)

StefanPochmann said:


> Also, if I had found a significant deviation or pattern for centers already then I would've worried about the scrambling method at this point already.


I wouldn't be (any more) worried if that happened. The goal of a scramble is to scramble the pieces relative to the centers. In the extreme case, if the centers were oriented in the same way every time, it would not necessarily affect the scramble quality at all; look at the 3x3 scrambler. Nobody cares if their cube is in a random orientation because 15 seconds is plenty of time to fix that. Essentially the orientation of the puzzle is a variable that runs 'perpendicular' to the position of the puzzle, and one can be easily modified without affecting the other, so knowing the distribution of one gives us absolutely no data about the distribution of the other.



StefanPochmann said:


> qqwref said:
> 
> 
> > Looking at where the green-white-red corner ends up on a 70-move 100x100 scramble would also be *indistinguishable from random*
> ...


You're completely wrong here. A discrete random variable will almost never (probability -> 0 as number of trials -> infinity) have exactly the distribution that the probabilities say; that's why we have to use statistical tests to check if something is random or not. When I say indistinguishable from random I mean that there is no statistically significant difference between the results you get and the results a random state scrambler would turn up.



StefanPochmann said:


> Having 2^70 possible combinations is not a problem.


There are approximately 12*4 * (11*4)^12 (this is about 2^71) positions that are 13 moves away from solved on the megaminx. So here is a hypothetical scrambler: choose a position that is 13 or fewer moves away from solved on the megaminx, and provide the optimal solution. This should have ~2^70 positions within an order of magnitude or two. Clearly this is not a good scrambler.
You seem to be completely misinterpreting me: I am not saying that 2^70 combinations means there MUST be a problem; I am instead saying that it means there is PROBABLY a problem. It is a red flag that suggests we need further investigation before we will know if the scrambling method is acceptable. You have chosen 2^70 elements of the megaminx positions set without knowing which elements you chose. Perhaps it is a representative sample, but perhaps it is not and the scrambles are in some way easier (or harder!) than average. I think we need to know this.



StefanPochmann said:


> Wait. I'm confused. Do we know nothing about it or do we know that it suggests lack of randomness? What is it?


For someone who is so pedantic you are making quite a lot of thought errors. The _definition of the set_ suggests that it may lack randomness, because the set is a tiny subset of the set of megaminx positions. It's a tiny subset which is specifically defined, and we do not know whether this definition provides good scrambles or scrambles that are on average biased in a certain way, because the definition is not "we randomly chose 2^70 positions and solved them" but rather something more algorithmic. As for no knowledge, it is true: we know pretty much nothing about _the set itself_ except that it contains all of the scrambles which are 70 Pochmann-scrambler moves from solved. I really have no clue about the macroscopic properties of the set itself, which is why we need to do more analyses.




Herbert Kociemba said:


> We will get "only" about 2^31 different valid cubes by the built in random generator. This is about 4.96505*10^-9 percent of the cube space....


Perhaps we *should* start using random.org data then...


----------



## Stefan (Nov 2, 2009)

Herbert Kociemba said:


> We will get "only" about 2^31 different valid cubes by the built in random generator. This is about 4.96505*10^-9 percent of the cube space....


Thanks a lot for checking that. I've been wondering for quite a while and tried looking for information about Pascal's or Delphi's PRNG but couldn't find anything.



qqwref said:


> Essentially the orientation of the puzzle is a variable that runs 'perpendicular' to the position of the puzzle, and one can be easily modified without affecting the other, so knowing the distribution of one gives us absolutely no data about the distribution of the other.


1. You can and apparently do think of the centers solely as a frame, a reference defining where all other pieces have to move to. But that's just one perspective. I can equally well choose to use the white/purple/green corner or the yellow/orange edge as reference and regard the centers as one moving piece.
2. If already the centers deviate significantly, I would worry that other pieces, scrambled with the same algorithm, might deviate similarly.
3. The scrambler does *not* pick a random side to turn next, it can only "turn" the top or the left one. Thus "puzzle orientation" does matter.



qqwref said:


> StefanPochmann said:
> 
> 
> > Wrong. To be indistinguishable from random, all my values would have to be 8.333% (exactly 1/12). Clearly this is not the case.
> ...


No you're wrong. You'd be right if I let's say analyzed a million somewhat randomly selected scrambles. But I didn't. I analyzed all 2^70 possible scrambles. I'm not analyzing some outputs of the system. I'm analyzing the system. And proved that it introduces a bias that a truly random one wouldn't have.



qqwref said:


> StefanPochmann said:
> 
> 
> > Having 2^70 possible combinations is not a problem.
> ...


I think I do and did understand you. You made your position quite clear early in the thread. But you and hawk still somewhat give the impression that a small number of outcomes is a problem. One example is you picking the above quote of mine and commenting a lot on it as if it were wrong. But I only said that a small number of outcomes is not a problem. That's not wrong. It really isn't a problem. It can indicate there might be a problem, but it's not a problem itself. In your hypothetical scrambler, the small number of scrambles isn't the problem, the scrambles almost not scrambling is the problem.

And I wouldn't like people to read this thread and think it is a problem. People shouldn't worry about the quality of our scrambles purely because of a small number of possible outcomes. Like for example...



qqwref said:


> Herbert Kociemba said:
> 
> 
> > We will get "only" about 2^31 different valid cubes by the built in random generator. This is about 4.96505*10^-9 percent of the cube space....
> ...


... this guy.


----------



## hawkmp4 (Nov 2, 2009)

This is starting to get a little over my head but I'm not understanding how you can say that a small number of outputs is not a problem. It MIGHT not be a problem, it might be. It depends on whether the states we can generate are a representative sample of the entire set of states. Correct?
So we'd need to check that. We can't say either way whether it's a problem yet.
It bugs me, yes. I'd like us to be able to generate all possible states. But if we can't, or we can't practically, a representative sample will satisfy me.


----------



## Stefan (Nov 2, 2009)

hawkmp4 said:


> This is starting to get a little over my head but I'm not understanding how you can say that a small number of outputs is not a problem. It MIGHT not be a problem, *it might be*.


There, Michael, see?



hawkmp4 said:


> It depends on whether the states we can generate are a representative sample of the entire set of states. Correct?


No that doesn't depend. If the scrambler doesn't generate a representative sample of the entire set of states, then *that* might be a problem. The small number of outputs *is not*.

And Michael: Do you think I'm trying to prove the WCA scrambler to be good, maybe because I came up with it and want to "defend" it/myself? Not at all the case. I want it analyzed and will gladly accept any result. As usual I take the agnostic middle ground. I'm not saying it's good, I'm not saying it's bad. I just don't like people saying it's bad for flawed reasons. I'd just as much fight flawed points arguing that it's good.


----------



## hawkmp4 (Nov 2, 2009)

StefanPochmann said:


> hawkmp4 said:
> 
> 
> > This is starting to get a little over my head but I'm not understanding how you can say that a small number of outputs is not a problem. It MIGHT not be a problem, *it might be*.
> ...


I won't flatter myself, I'm assuming when you say Michael you're addressing Michael Gottlieb.

Well, I think having a small amount of outputs can definitely be a problem. We could have a scrambler that generates 500 states. They might fit the expected distributions of oriented and permuted pieces perfectly, that wouldn't be the problem. Granted, I don't think that will be an issue with the numbers we're talking about here.


----------



## Stefan (Nov 3, 2009)

hawkmp4 said:


> I won't flatter myself, I'm assuming when you say Michael you're addressing Michael Gottlieb.


Oops. Yeah. That happens if you guys are hiding.



hawkmp4 said:


> Well, I think having a small amount of outputs can definitely be a problem. We could have a scrambler that generates 500 states.


I agree:


StefanPochmann said:


> What's important is the quality of the scrambling, not the number of possible outcomes *(if the latter is reasonable (so don't say I said a megaminx scrambler with ten possible outcomes can be good))*.


----------



## qqwref (Nov 3, 2009)

StefanPochmann said:


> You can and apparently do think of the centers solely as a frame, a reference defining where all other pieces have to move to. But that's just one perspective.


You say "just one perspective" as if there are other meaningful ones from the context of speedsolving. There aren't. Nobody fast holds an edge/corner fixed and moves the centers and the rest of the pieces around it. The ending position after the scramble, relative to the centers, is all that is important.



StefanPochmann said:


> I analyzed all 2^70 possible scrambles. I am not analyzing some outputs of the system.


You're saying you literally looked at all 2^70 (~10^21) things? How did you manage that? (By the way, you did not state this when you posted the results, so it isn't fair to criticize me for not knowing this.) This does change the interpretation of the results you posted.



StefanPochmann said:


> People shouldn't worry about the quality of our scrambles purely because of a small number of possible outcomes.


I've said many times, though, it is not the quality of single scrambles I am concerned with, but the quality of the algorithm that generates those scrambles. The small number of outcomes does not tell us there is anything wrong with individual scrambles, just that the algorithm which produces scrambles is provably not uniformly random with regard to Megaminx positions.



StefanPochmann said:


> And Michael: Do you think I'm trying to prove the WCA scrambler to be good, maybe because I came up with it and want to "defend" it/myself? Not at all the case.


Actually this IS the vibe I am getting from your posts. You are attacking my wording more than anything else... when I take shortcuts and write "scrambles" instead of "distribution of scrambles", or "is a problem" for "is something that introduces an element of doubt into the scrambler's macroscopic randomness", you attack the philosophy you think the words are describing, even though I have clearly stated exactly what I mean several times. You seem to be arguing even when you agree with my point, just because you think the way I said things might be misinterpreted. To be honest, I feel this is more the behavior of someone who is desperately trying to defend something they made than the behavior of someone who agrees with my point that several characteristics of the scrambler signify that it needs to be looked at more carefully before we can assume it is unbiased. Even if you are not consciously trying to defend your scrambler, insisting that you are only arguing against me for the purpose of correcting my errors in wording is an arrogant lie.



StefanPochmann said:


> hawkmp4 said:
> 
> 
> > This is starting to get a little over my head but I'm not understanding how you can say that a small number of outputs is not a problem. It MIGHT not be a problem, it might be.
> ...


I have stated many times that I mean exactly this. The small number of outputs signifies that we cannot assume the scrambler is unbiased without actually examining it. There is no excuse for why you keep deliberately misinterpreting my statements.



StefanPochmann said:


> But you and hawk still somewhat give the impression that a small number of outcomes is a problem.


I have stated many times exactly what I mean. There is no excuse for why you keep deliberately misinterpreting my statements.


If you want to keep talking, go ahead, but I'm done. This topic is devolving into deliberate failure to communicate, condescending remarks, and arrogance. It looks like you are not trying to make a point, but just to be insulting and annoying. If you want to have an actual argument, let me know.


----------



## hawkmp4 (Nov 3, 2009)

My apologies, I missed that statement. Then yes, I agree with your last post.


----------



## Stefan (Nov 3, 2009)

qqwref said:


> You're saying you literally looked at all 2^70 (~10^21) things?


Yes. That's what I said. Can't you just trust me a bit?



qqwref said:


> How did you manage that?


Same way I for example analyzed all 3x3x3 scrambles up to 100 moves before: Dynamic programming. See the code (still available on the page) if you're interested, and I can explain it further if necessary (though it's really rather easy, I even used a static 3-dimensional table rather than using and reusing 2-dimensional ones, because I find it cleaner that way).



qqwref said:


> By the way, you did not state this when you posted the results, so it isn't fair to criticize me for not knowing this.


Well I presented it as the probabilities of where the centers end up. Of course I wouldn't have done that if I had only analyzed a subset, that would just be wrong. At least I would've used past tense and provided details like the used sample size. If I say I show you the probabilities, you ought to assume that I'm showing you the exact ones (except negligible rounding) and not some randomized experimental results. Also, I did mention it in the program.



qqwref said:


> Actually this IS the vibe I am getting from your posts.


Really not the case. That I created it is a coincidence. I'm defending it against what I consider flawed arguments because I don't like flawed arguments, not because I made it. Why do I defend the Cube Explorer scrambling the same way? I didn't make that one. Explain? And remember the Matyas issue where I quite a bit criticized both his believers and his doubters for their flawed arguments for or against him? I'm a proponent of solid reasoning, that's all. Btw, it's not nice to call my arguments stupid, me arrogant and a liar. I know I've been somewhat gruff, too, but I don't think I went nearly that far and explicitly personal.



hawkmp4 said:


> My apologies, I missed that statement. Then yes, I agree with your last post.


No worries, I might've edited that in after you read the post. And just like you said, it's clear we're really not talking about *that* small numbers anyway.

Done, too (until I have new analysis results or am asked something).


----------



## hawkmp4 (Nov 3, 2009)

Surely this could be of some use, yes? If we're concerned about enough scrambling states.


----------



## cuBerBruce (Nov 3, 2009)

Herbert Kociemba said:


> Motivated by the discussion here I took a closer look at the Assembler Code of the Random function Random(N) of the Cube Explorer in Delphi6, which computes a pseudorandom integer between 0 and N-1. RandSeed is a 32 bit integer variable and has 2^32 different values. On entry, the EAX register contains N.
> I added some comments
> 
> XOR EBX, EBX
> ...



I have somewhat wondered about the actual "randomness" of Cube Explorer's random states.

Some things I note about the procedure. Each internally generated cube (legal or illegal state) uses 4 random numbers. 4 is a factor of the number of states of the PRNG (2^32). That means that once we pick an initial seed, (and assuming the PRNG is only being used to generate random cubes), then the sequence of internally generated cubes will repeat themselves after 2^30 cubes (or approximately every 2^29 legal cubes). We are locked into a cycle that will only generate 1/4 of the possible cubes that we could get (unless we reseed, or "waste" a PRNG value once in awhile, or do something similar). I would prefer the number of PRNG values used to determine a cube to be relatively prime to the number of PRNG states.

My second comment has to do with the throwing away of the illegal cubes. Approximately half the generated cubes are thrown away. Clearly there will be streaks where a number of consecutive cubes are thrown away. Let's say there is a streak of three consecutive cubes that are thrown away, followed by one that is legal, so it's not thrown away. Note that there will be four different possible initial seeds that will result in this particular legal cube being returned. Let's say the next cube generated is also a legal cube. So it may very well be that only that one initial seed will result in that particular cube being the first cube returned. So the one cube will be four times more likely to be the first returned cube than the other one. Personally I would prefer a procedure that would always pick legal cubes, so none are thrown away, and thus avoiding this type of bias.


----------



## AvGalen (Nov 3, 2009)

I am probably one of very few people that has actually used the old notation for scrambles. It basically meant that 1 person that knew how that notation worked scrambled all megaminxes during a competition because nobody else could. It also took me minutes to perform and only about half of the time I got it right.

With the new method everyone can help scrambling megaminx resulting in more competitors during tournaments, less burden on 1 scrambler and most importantly competitors actually have to solve the same scramble because it is so easy to perform it correct. (I pop megaminxes more often than I make scrambling mistakes). With the increased interest has come increased solving speed and I think we are getting to the point where we could change the format of megaminx to be average of 5 (with a possible cut-off time to prevent beginners to take to much time) much like 5x5x5.

For me the most important things about any scramble is that it is
1) "random enough"
2) the same for all competitors

Getting a perfectly random stated generated scramble with an optimal generated scramble-string would be ideal for every puzzle, but it isn't going to happen for megaminx. The current method Just Works (tm).

The thing I am most interested in are the exclusive double turns. They seem to reduce randomness of scrambles (possible number of positions) without a good need for that. The only sources I could find about this choice are the following two quotes and they are both not very scientific (doesn't make them less true though). The first only mentions a belief without any further explanation, the second takes the extreme of only 10 moves in the example (with only one double turn). I have tested the following:
1) Perform the scramble as stated, alternating y and y' rotations at the end of each line 
2) Perform the scramble as stated, alternating y and y' rotations at the end of each line but also alternating between single and double turns (a++b++c++d++e++f++g++h++ becomes a++b++c+d+e++f++g+h+)
The first had one pair build after the entire scramble, the second didn't have any pairs build at all. Both looked like they had no solved corners or edges at all. Both had 3 faces where 1 cross-edge was attached, but not solved (like FRB D would do to the D-face on a cube

I realise that this isn't a good scientific analysis, but for me it is Good Enough (another tm). I would like to know in more detail why Stephan thinks exclusive double turns are a good idea. Changing the scrambler to include quarterturns would greatly increase the amount of possible results (from 2^70 to 4^70 or 10^42 if my math is not mistaken not taking rotations into account) and didn't significantly increase the difficulty/accurarcy/time of scrambling.

The two quotes I mentioned earlier:


> Notice that "++" means two-fifth turns clockwise (in the direction of the arrow) and "--" means two-fifth turns counterclockwise (against the direction of the arrow). It's intentional that there are only two-fifth turns and no one-fifth turns, I believe this improves the scramble quality.


 


> Now an improvement...
> 
> I thought I use many more "two fifth" turns than "one fifth" turns when I scramble for practice, and I just checked it and it turns out I do 2/5 turns *exclusively*. And they look better, too. Try just the first line of the above scramble:
> R- D- R++ D+ R+ D- R- D- R+ D-
> ...


----------



## Stefan (Nov 3, 2009)

Arnaud, if I remember correctly that first quote of mine was a result of the second, particularly the ending part of that one where I was delighted that sixty 2/5 turns had broken up all pairs. I think with mixed 1/5 and 2/5 I usually had more pairs or even unscrambled areas left at this point, presumably because 1/5 turns didn't _"get me around"_ much. But it wasn't a thorough analysis and I don't have statistics to support it.

I just adapted my center-analysis to compare only2/5, only1/5 and mixed. Interestingly, only2/5 and only1/5 perform exactly the same in this respect. And mixed performs much better, which is why I show the results of not 70 but only 15 moves scrambling, that's shortly before mixed flatlines (when rounding the results like I do).

only2/5:

```
DLB   DLF   DRB   URF   ULF   B     F     U     URB   DRF   ULB   D       min   max    stdDev sum
     pink 10.236 10.812 9.302 6.506 6.006 9.668 7.181 4.749 5.933 10.599 6.250 12.759   4.749-12.759  2.419  100.0
    blue2 10.812 10.236 9.302 5.933 6.006 10.599 6.250 4.749 6.506 9.668 7.181 12.759   4.749-12.759  2.419  100.0
   orange 9.302 9.302 3.784 6.006 14.160 5.713 11.359 12.366 6.006 5.713 11.359 4.932   3.784-14.160  3.256  100.0
   purple 6.506 5.933 6.006 10.236 9.302 7.181 9.668 12.759 10.812 6.250 10.599 4.749   4.749-12.759  2.419  100.0
turquoise 6.006 6.006 14.160 9.302 3.784 11.359 5.713 4.932 9.302 11.359 5.713 12.366   3.784-14.160  3.256  100.0
   green2 9.668 10.599 5.713 7.181 11.359 6.470 10.052 8.682 6.250 7.974 8.545 7.507   5.713-11.359  1.732  100.0
    green 7.181 6.250 11.359 9.668 5.713 10.052 6.470 7.507 10.599 8.545 7.974 8.682   5.713-11.359  1.732  100.0
    white 4.749 4.749 12.366 12.759 4.932 8.682 7.507 9.302 12.759 8.682 7.507 6.006   4.749-12.759  2.896  100.0
     blue 5.933 6.506 6.006 10.812 9.302 6.250 10.599 12.759 10.236 7.181 9.668 4.749   4.749-12.759  2.419  100.0
      red 10.599 9.668 5.713 6.250 11.359 7.974 8.545 8.682 7.181 6.470 10.052 7.507   5.713-11.359  1.732  100.0
    brown 6.250 7.181 11.359 10.599 5.713 8.545 7.974 7.507 9.668 10.052 6.470 8.682   5.713-11.359  1.732  100.0
   yellow 12.759 12.759 4.932 4.749 12.366 7.507 8.682 6.006 4.749 7.507 8.682 9.302   4.749-12.759  2.896  100.0
```

only1/5:

```
DLB   DLF   DRB   URF   ULF   B     F     U     URB   DRF   ULB   D       min   max    stdDev sum
     pink 6.470 7.974 5.713 10.052 11.359 10.599 6.250 7.507 8.545 9.668 7.181 8.682   5.713-11.359  1.732  100.0
    blue2 7.974 6.470 5.713 8.545 11.359 9.668 7.181 7.507 10.052 10.599 6.250 8.682   5.713-11.359  1.732  100.0
   orange 5.713 5.713 3.784 11.359 14.160 9.302 6.006 4.932 11.359 9.302 6.006 12.366   3.784-14.160  3.256  100.0
   purple 10.052 8.545 11.359 6.470 5.713 6.250 10.599 8.682 7.974 7.181 9.668 7.507   5.713-11.359  1.732  100.0
turquoise 11.359 11.359 14.160 5.713 3.784 6.006 9.302 12.366 5.713 6.006 9.302 4.932   3.784-14.160  3.256  100.0
   green2 10.599 9.668 9.302 6.250 6.006 10.236 6.506 12.759 7.181 10.812 5.933 4.749   4.749-12.759  2.419  100.0
    green 6.250 7.181 6.006 10.599 9.302 6.506 10.236 4.749 9.668 5.933 10.812 12.759   4.749-12.759  2.419  100.0
    white 7.507 7.507 4.932 8.682 12.366 12.759 4.749 9.302 8.682 12.759 4.749 6.006   4.749-12.759  2.896  100.0
     blue 8.545 10.052 11.359 7.974 5.713 7.181 9.668 8.682 6.470 6.250 10.599 7.507   5.713-11.359  1.732  100.0
      red 9.668 10.599 9.302 7.181 6.006 10.812 5.933 12.759 6.250 10.236 6.506 4.749   4.749-12.759  2.419  100.0
    brown 7.181 6.250 6.006 9.668 9.302 5.933 10.812 4.749 10.599 6.506 10.236 12.759   4.749-12.759  2.419  100.0
   yellow 8.682 8.682 12.366 7.507 4.932 4.749 12.759 6.006 7.507 4.749 12.759 9.302   4.749-12.759  2.896  100.0
```

mixed:

```
DLB   DLF   DRB   URF   ULF   B     F     U     URB   DRF   ULB   D       min   max    stdDev sum
     pink 8.334 8.334 8.332 8.333 8.334 8.333 8.334 8.334 8.333 8.333 8.334 8.334   8.332-8.334  0.001  100.0
    blue2 8.334 8.334 8.332 8.333 8.334 8.333 8.334 8.334 8.333 8.333 8.334 8.334   8.332-8.334  0.001  100.0
   orange 8.332 8.332 8.338 8.334 8.333 8.332 8.334 8.332 8.334 8.332 8.334 8.332   8.332-8.338  0.002  100.0
   purple 8.333 8.333 8.334 8.334 8.332 8.334 8.333 8.334 8.334 8.334 8.333 8.334   8.332-8.334  0.001  100.0
turquoise 8.334 8.334 8.333 8.332 8.338 8.334 8.332 8.332 8.332 8.334 8.332 8.332   8.332-8.338  0.002  100.0
   green2 8.333 8.333 8.332 8.334 8.334 8.334 8.333 8.334 8.334 8.334 8.333 8.334   8.332-8.334  0.001  100.0
    green 8.334 8.334 8.334 8.333 8.332 8.333 8.334 8.334 8.333 8.333 8.334 8.334   8.332-8.334  0.001  100.0
    white 8.334 8.334 8.332 8.334 8.332 8.334 8.334 8.332 8.334 8.334 8.334 8.334   8.332-8.334  0.001  100.0
     blue 8.333 8.333 8.334 8.334 8.332 8.334 8.333 8.334 8.334 8.334 8.333 8.334   8.332-8.334  0.001  100.0
      red 8.333 8.333 8.332 8.334 8.334 8.334 8.333 8.334 8.334 8.334 8.333 8.334   8.332-8.334  0.001  100.0
    brown 8.334 8.334 8.334 8.333 8.332 8.333 8.334 8.334 8.333 8.333 8.334 8.334   8.332-8.334  0.001  100.0
   yellow 8.334 8.334 8.332 8.334 8.332 8.334 8.334 8.334 8.334 8.334 8.334 8.332   8.332-8.334  0.001  100.0
```

The program:


Spoiler





```
#-----------------------------------------------------------------------------
# This program analyses the WCA megaminx scrambler. See also:
# http://www.stefan-pochmann.info/spocc/other_stuff/tools/scramble_megaminx/
#
# So far it only computes the probability of where each center ends up. The
# results are shown on the above-mentioned page.
#
# Implementation:
#   1) Configure the scramble length and specify the puzzle (pieces+moves).
#   2) Compute the probability of center C ending up at position P after
#      M moves as $probability{C}{P}{M}. This is done using dynamic programming.
#   3) Evaluate the results.
#
# Note that this is *not* an analysis of a limited number of scrambles, for
# example a million scrambles. This is an analysis of *all* 2^70 possible
# scrambles.
#-----------------------------------------------------------------------------

use strict;

#--- Configure the scramble length
my $scrambleLength = 15;

#--- Specify where the centers start
my %origin = (
  white     => 'U',
  green     => 'F',
  turquoise => 'ULF',
  brown     => 'ULB',
  blue      => 'URB',
  purple    => 'URF',
  yellow    => 'D',
  green2    => 'B',
  pink      => 'DLB',
  blue2     => 'DLF',
  red       => 'DRF',
  orange    => 'DRB'
);

#--- Specify the effects of the moves on the centers (list the cycles)
my %effects = (
  'D++' => 'F ULB URF ULF URB + B DLF DRB DLB DRF',
  'R++' => 'U DLB F ULB DLF + URF B DRF URB D',
  'D+' => 'F ULF ULB URB URF + B DLB DLF DRF DRB',
  'R+' => 'U ULB DLB DLF F + URF URB B D DRF',
);
@effects{'R-','R--','D-','D--'} = map { join ' ', reverse split } @effects{'R+','R++','D+','D++'};

#--- Initialize the center-position probabilities before scrambling
my %probability;
for my $center ( keys %origin ) {
  $probability{$center}{$origin{$center}}{0} = 1;
}

#--- Spread the love (the probabilities through moves)
for my $numberOfMoves ( 1..$scrambleLength ) {
  for my $center ( keys %origin ) {
    for my $positionBeforeMove ( values %origin ) {
#      my @moves = $numberOfMoves%2 ? ( 'R++', 'R--' ) : ( 'D++', 'D--' );
#      my @moves = $numberOfMoves%2 ? ( 'R+', 'R-' ) : ( 'D+', 'D-' );
      my @moves = $numberOfMoves%2 ? ( 'R+', 'R++', 'R--', 'R-' ) : ( 'D+', 'D++', 'D--', 'D-' );
      for my $move ( @moves ) {
        my $positionAfterMove = applyMove( $move, $positionBeforeMove );
        $probability{$center}{$positionAfterMove}{$numberOfMoves} +=
          $probability{$center}{$positionBeforeMove}{$numberOfMoves-1} / @moves;
      }
    }
  }
}

#--- Evaluate the center-position probabilities after scrambling
print "          ";
printf "%-6s", $_ foreach values %origin;
print "  min   max    stdDev sum\n";
for my $center ( keys %origin ) {
  printf "%9s ", $center;
  my ( $min, $max, $stdDev, $sum ) = ( 100, 0, 0, 0 );
  for my $position ( values %origin ) {
    my $p = 100 * $probability{$center}{$position}{$scrambleLength};
    printf "%.3f ", $p;
    $min = $p if $p < $min;
    $max = $p if $p > $max;
    $stdDev += ( $p - 100/12 ) ** 2; 
    $sum += $p;
  }
  printf "  %.3f-%.3f  %.3f  %.1f\n", $min, $max, sqrt( $stdDev / 12 ), $sum;
}



#--- Helper for applying a move
sub applyMove {
  my ( $move, $position ) = @_;
  my $effects = join ' + ', map { "$_ $_" } split / \+ /, $effects{$move};
  return $effects =~ /\b$position\b (\w+)/  ?  $1 : $position;
};

#--- Helper for debugging
sub show {
  use Data::Dumper;
  print Dumper( @_ );
};
```



Granted, this doesn't necessarily tell much about the scrambling results, but it's something I could do quickly so I just did.


----------



## Stefan (Nov 3, 2009)

qqwref said:


> StefanPochmann said:
> 
> 
> > You can and apparently do think of the centers solely as a frame, a reference defining where all other pieces have to move to. But that's just one perspective.
> ...


Actually I think most of them do that. Every time using the WCA scrambler. Which is what we're talking about here.

I just realized I had forgotten a main reason for implementing the center-analysis first: I pretty much need to do it anyway. Cause the scrambler does move them around. I could consider the centers fixed but then I'd need to keep track of which sides I can move next and that just seems to be a complicated and unnatural way to hide the center movements. Also, other scramblers might move centers as well and I want to build an analyzer where you can easily just describe any scrambling method and then the analyzer does the statistics. So that we can easily compare different scrambling methods statistically. I want it to be capable of covering the WCA scrambler so I need to be general and cover the center movements and treating all pieces (including centers) the same way looks like a good and natural way to implement that.


----------



## AvGalen (Nov 3, 2009)

So based on your feeling and some handscrambling your conclusion is that only 2/5 is best because it moves pieces around more

But the first computer result indicates that 1/5 and 2/5 mixed is basically perfect (I would actually say *too* perfect. If I would get results like that I would think either "bug in code" or "there must be a logical explanation for this"). I might have some time tomorrow to look at the code but it depends on the number of questions my students will have for me as they have a big deadline coming up


----------



## Stefan (Nov 3, 2009)

AvGalen said:


> If I would get results like that I would think either "bug in code" or "there must be a logical explanation for this"


Well, I see two reasons:

1) Each value is spread to four different values rather than to just two. With every single move. So this alone spreads the probabilities much faster.
2) I haven't understood why only1/5 and only2/5 behave exactly the same (same values just on different places) but I imagine each has a pattern and the mixed scrambling has the advantage of being able to switch between the two patterns.

It's like driving twice as fast and being able to use shortcuts.

I did indeed have a bug in my program, btw, I specified the first cycle of D++ and D+ in the wrong direction. After fixing it I got the same results, though. Next I'll implement analyzing groups of pieces so I can find out where they move relative to each other, and the first test will be two centers. This should result in 60 outcomes and I'll test again with the false D++ specification, it should fail then.


----------



## Pedro (Nov 4, 2009)

I've read the whole thread, and I'm not a mathematician (spelling?) but I'm interested in the discussion. (probably because I solve the megaminx)

From what I understood, we can generate "only" 2^70 (or 10^21) positions with the current scramble. So if we use 80 moves, we can get ~10^24 positions. Is it still that bad? I wouldn't mind doing an extra line of scrambling. Or maybe 2 or 3. (10^30 possible positions)


Arnaud, I think alternating between 1/5 and 2/5 turns would lead to many mistakes. The way it is now is pretty cool, because I can do it almost on auto-pilot, just reading - or + and turning accordingly.

If I had 2 megaminxes, I could try doing the same scramble twice and seeing if they match, and how long it takes. Anyone can do it?

EDIT
Forgot to say that I check sometimes the scramble on cct, and is correct.
Also, just did a scramble in 36.15. Of course it's with my own minx, which I know pretty well and can move fast and nice.
Solved in 1:38.22.

Then I did a scramble with 100 moves...in 46.78. Guess I messed up a little bit on the first one. Solved in 1:52.09 (yeah, I suck).
Solve didn't feel harder (mom talking to me makes it harder). Will do another one.

Scramble: 47.96
Solve: 1:39.31


----------



## hawkmp4 (Nov 4, 2009)

Pedro said:


> I've read the whole thread, and I'm not a mathematician (spelling?) but I'm interested in the discussion. (probably because I solve the megaminx)
> 
> From what I understood, we can generate "only" 2^70 (or 10^21) positions with the current scramble. So if we use 80 moves, we can get ~10^24 positions. Is it still that bad? I wouldn't mind doing an extra line of scrambling. Or maybe 2 or 3. (10^30 possible positions)
> 
> ...


2^70/10^68 = 1.180591621*10^−47
2^100/10^68 = 1.2676506*10^−38
Adding 3 more lines of scrambling still results in a tiny fraction of one percent of all the possible states. An insignificant gain really.
In fact...
We'd need to do ~230 twists to get all the possible states of the megaminx. Or, 23 lines of scrambling. And that's a lower bound- that assumes that every possible sequence generates a different state.
This is all assuming we could actually generate that many sequences. Currently we're limited by the PRNG of whatever language we're using for the scrambler program. Java, for example, only provides 48 random bits. In other words, it can provide only 2^48 different scrambles at most. So, the scrambling method is not what's holding us back. It's our random number generation.


----------



## spdqbr (Nov 4, 2009)

hawkmp4 said:


> *snip*
> This is all assuming we could actually generate that many sequences. Currently we're limited by the PRNG of whatever language we're using for the scrambler program. Java, for example, only provides 48 random bits. In other words, it can provide only 2^48 different scrambles at most. So, the scrambling method is not what's holding us back. It's our random number generation.



This is not actually the case. If we were relying on the RNG to generate one of the possible states of the megaminx, and then finding an algorithm to generate that state, this would be true.

Since we are not generating random states, but a random sequence (ok, pseudo random) of moves we are not limited in this fashion. If we assume that all states achievable on the megaminx are achievable by use of R++, R--, D++, D--, Y and Y' then all states would be achievable with a scrambler that generates a sufficiently long sequence of moves.

Example, consider a 3x3 cube. We have 6 sides and 3 possible ways of turning each side. You could generate all possible states of a 3x3 using 2 dice if you roll them enough times (1st die chooses the side, second die chooses the direction of the turn 1-2 clockwise, 3-4 half turn, 5-6 anti-clockwise). So even though this pair of dice only has 36 possible outcomes (7 bits), and really only 18 because of the binning of the second die (6 bits), it is possible to generate all possible states of the cube if you roll them enough times.


----------



## hawkmp4 (Nov 4, 2009)

I'm very sleepy at the moment and largely incoherent, but I'll have a go at this and clarify in the morning if needed.
My understanding though, is that a RNG is not like a large pool of dice, but a large pool of results of previously rolled dice. We can use however many dice rolls results we want, and if we go over the total number of dice roll results in memory it 'wraps' around to the beginning of the sequence of dice roll results.
Is that correct?
If so, that would still mean that we could only generate a limited amount of unique sequences of dice roll results, since we have a limited number of dice roll results and must start 'reusing' the sequence at the beginning.


----------



## AvGalen (Nov 4, 2009)

That is correct hawkmp4. RNG's start with a seed/randomise call that generates a random number in the case of Java that could apparently generate 2^48 random FIRST numbers. All numbers after that seem random, but are actually fixed based on the first number. You can test this by calling a random fucntion without seeding/randomising it first. There is enough information in this thread to learn more and otherwise you can surely google about it.

I don't care if our scrambler will never genereate all possible states because there are never going to be enough competitions to use all of them anyway. The important thing for me is if the scrambles are random enough (a 4x4x4 scramble with always OLL parity or a 3x3x3 scramble that never scrambles the white-green edge would obviously be bad) and if they are easy enough to perform quickly so competitors are very likely to get the same scramble

I disagree with Pedro that adding 1/5th turns would be a lot slower or more problematic to perform. Obviously I would prefer simpler scrambles but if the quality can be improved adding 1/5th turns would be worth it

Stefan, can you recommend a good, downloadable, free debugger for that code?


----------



## spdqbr (Nov 4, 2009)

I hadn't thought of it from that angle.

However, even if we are only capable of generating 2^48 first numbers, we can pick any of those 2^48 first numbers to be our first twist. Further we can twist for as long or as short as we wish. You're not guaranteed to hit every state, but surely you could hit more than 2^48 states this way...

I agree it's not necessarily important have every state as a possible outcome, but a representative distribution would be nice. 

I'm almost done recoding the scramble analyzer, just the tedious coding of the megaminx scramble tracking is left.. blech.


----------



## Stefan (Nov 4, 2009)

AvGalen said:


> a 4x4x4 scramble with always OLL parity [...] would obviously be bad


Indeed. I vote for one with always *no* OLL parity.



AvGalen said:


> Stefan, can you recommend a good, downloadable, free debugger for that code?


I just debug using print statements. If you need Perl (it's written in that) and are using Windows, I recommend ActivePerl.



spdqbr said:


> Further we can twist for as long or as short as we wish. You're not guaranteed to hit every state, but surely you could hit more than 2^48 states this way...


1. Not without an independent second source of randomness.
2. How are you going to determine for how long or short you twist?



AvGalen said:


> I have tested the following:
> 1) Perform the scramble as stated, alternating *y and y'* rotations at the end of each line





spdqbr said:


> If we assume that all states achievable on the megaminx are achievable by use of R++, R--, D++, D--, *Y and Y'*


Where's the Y/Y' still coming from? The WCA uses U/U'.


----------



## Stefan (Nov 4, 2009)

StefanPochmann said:


> I did indeed have a bug in my program, btw, I specified the first cycle of D++ and D+ in the wrong direction. After fixing it I got the same results, though. Next I'll implement analyzing groups of pieces so I can find out where they move relative to each other, and the first test will be two centers. This should result in 60 outcomes and I'll test again with the false D++ specification, it should fail then.


Bonus points if you see why it didn't fail so far and why I expect it to fail that planned test.


----------



## Stefan (Nov 4, 2009)

StefanPochmann said:


> spdqbr said:
> 
> 
> > Further we can twist for as long or as short as we wish. You're not guaranteed to hit every state, but surely you could hit more than 2^48 states this way...
> ...


Ok I somewhat take that back, I can see now how to do it. Would like to see how *you* would do it, though.


----------



## hawkmp4 (Nov 4, 2009)

If we allowed the scramble length to vary we could get more states. Say, generating scrambles with either 70 or 80 moves would double the number of unique scrambles we could get over a strictly 70 move scramble, if we're limited by a 48 bit PRNG.
EDIT: I take that back. We'd need to use one random bit to decide how long the scramble would be, cancelling any gain.


----------



## spdqbr (Nov 4, 2009)

StefanPochmann said:


> StefanPochmann said:
> 
> 
> > spdqbr said:
> ...



While that seems a little high-handed, I'll I've it a go after work when I don't have to type on my phone...


----------



## Stefan (Nov 4, 2009)

spdqbr said:


> StefanPochmann said:
> 
> 
> > Ok I somewhat take that back, I can see now how to do it. Would like to see how *you* would do it, though.
> ...


Just wanna see what you had in mind or come up with without being influenced by reading what I have in mind. (If you meant the marking of "you", that was not for _"I don't think *you* can do it"_ but to clearly distinguish it from _"one"_ which I thought it can be interpreted as in English. In other words, I just wanted to address you specifically). 

I did that analysis of center pairs now. As expected, the result was 60 outcomes (also as expected, the wrong D+/D++ definition resulted in 132 outcomes). The results of 70 moves scrambling:


```
only2/5:
number of outcomes:  60
minimum probability: 1.66131
maximum probability: 1.67202
standard deviation:  0.00248
sum (sanity check):  100.00000

only1/5:
number of outcomes:  60
minimum probability: 1.66131
maximum probability: 1.67202
standard deviation:  0.00248
sum (sanity check):  100.00000

mixed:
number of outcomes:  60
minimum probability: 1.66667
maximum probability: 1.66667
standard deviation:  0.00000
sum (sanity check):  100.00000
```

I also just created a sourceforge project for this. I made it general, to cover both scramblers and scramble analyzers, and for all our puzzles. My program is in the subversion repository already. If someone wants to join (for example Daniel, so we don't lose your work again), let me know.

Project: http://sourceforge.net/projects/cubescrambles/
SVN trunk: http://cubescrambles.svn.sourceforge.net/viewvc/cubescrambles/trunk/
My current program: http://cubescrambles.svn.sourceforg...k/megaminx/analyzer.pl?revision=3&view=markup

Next are edges and corners...


----------



## spdqbr (Nov 5, 2009)

Ah, the ambiguities of text on a page, gotta love it. Sorry for the mix up. I'll let you know what I come up with.


----------



## spdqbr (Nov 9, 2009)

Ok, now that the house guest is gone I can do a little calculating.

The first thing that comes to mind when trying to get more scrambles out of a prng by varying length is to vary the length on a set schedule. Example:
Assume our PRNG has a period of 2^48-1 bits. We can define a wca compliant megaminx scramble in only 77 bits:
00101001111 -> R++ R++ D-- R++ D-- R++ R++ D-- D-- D-- U' and so on in this fashion.

The most obvious schedule, however impractical, is to increase the length of the scrambles by 11 bits every 2^48+1 moves (though changing length at any interval length which is not commensurate with the period of the PRNG would do). This can be done for quite a long time before all possible scrambles of all possible lengths have been exhausted. In fact, for every possible scramble length there are 2^48 new scrambles you can get from the prng. The scrambles get very long though, and it doesn't solve the real-world problem, because we're never going to make it through all of the first 2^48 scrambles to get to the extended ones.

I haven't thought of a more clever solution using a single PRNG at this point, but if I were coding this, I would suggest perhaps using two instances of a prng, one for the scrambles and one to determine scramble length. This would be extremely easy in many languages available.

Stefan, I'm quite curious to hear your solution, as I'm sure there should be a more elegant way to do it and I'd wager you've found it.

Alternatively, we could just use the Mersenne Twister which should provide us with sufficient scrambles for as long as we need them.


----------



## hawkmp4 (Nov 9, 2009)

Using the variable length scrambles seems to be just a workaround to me, and besides, are there any other puzzles that don't have optimal scrambles that have variable length scrambles?
Also Dan, you saw for yourself with your analysis in the yahoo group topic that varying the scramble length lowered the confidence quite a bit. Is that something we should be taking into account, too?
How hard would implementing this Mersenne Twister algorithm into our scrambles for competition use? Also, can it be used easily in the myriad of online scramblers? I really don't know, I just think those are important questions. For example, if it wouldn't be hard to implement into cube explorer I think it would be a good idea too, but it may be a deterrent if it couldn't be used in javascript.
Just my thoughts. Off to bed now.


----------



## spdqbr (Nov 9, 2009)

You're correct hawk, I did find that variable length scrambles were bad. I'm curious what would happen if I re-ran the test with the minimum length being the lowest length which was deemed "good enough" It seems reasonable that if 45 moves gets us generally scrambled (on a 3x3) and we use variable length scrambles in the range 45-55 moves that should also be sufficient. The main result I was concerned with back then was could you decrease the required number of moves to randomize a cube by allowing variable length scrambles. (i.e. 45 was enough, but would selecting from a range between 30 and 40 get you there, the answer to which seems to be "no").

But yes, this feels like a cludge more than anything, I agree.

As for the Mersenne Twister, it seems relatively simple to implement. You can find an implementation in javascript here which I have yet to try. I did seamlessly incorporate this implementation in Java into my scramble generator with almost no work. Perhaps I should run the cube scramble analysis using this and other PRNGS and see what we see.

It looks like it's been implemented in just about every reasonable language.

Is this overkill? Probably. But I would like to pose the question: If it is exceedingly easy to implement in the scramblers and it gives us access to a larger portion of the puzzles' scramble-spaces, why not use it?


----------



## blade740 (Nov 9, 2009)

I agree with that: a better PRNG can only be a good thing.


----------



## Stefan (Nov 9, 2009)

spdqbr said:


> I haven't thought of a more clever solution using a single PRNG at this point, but if I were coding this, I would suggest perhaps using two instances of a prng, *one for the scrambles and one to determine scramble length*.


Or... if you're going to use another PRNG instance anyway, you could just use both directly for the scrambles. For example always use 35 bits from the first instance and 36 from the second.



spdqbr said:


> Stefan, I'm quite curious to hear your solution, as I'm sure there should be a more elegant way to do it and I'd wager you've found it.


Well, the other reason I asked for your idea was that mine wasn't particularly exciting. I summarize it as _"reuse and reinterpret"_. For example reuse the same PRNG output from the first scramble to produce a second scramble by using the old WCA scrambler. Or rotate the data, e.g., instead of using bits 1-70 use bits 2-70 followed by bit 1, then 3-70 followed by 1-2, etc. Or flip every second bit. Or xor with a chaotic bitmask.

Another idea is to really ask for a number of bits (Java's PRNG does support this) and have that number relatively prime to the number of bits in the cycle. So with a cycle of 2^48 values of 32 bits each, you could get 32*2^48 different scrambles rather than "only" 2^48.

But in my opninion this is all just overkill, provides no practical advantage and only makes things more complicated and error-prone.


----------



## hawkmp4 (Nov 9, 2009)

spdqbr said:


> You're correct hawk, I did find that variable length scrambles were bad. I'm curious what would happen if I re-ran the test with the minimum length being the lowest length which was deemed "good enough" It seems reasonable that if 45 moves gets us generally scrambled (on a 3x3) and we use variable length scrambles in the range 45-55 moves that should also be sufficient. The main result I was concerned with back then was could you decrease the required number of moves to randomize a cube by allowing variable length scrambles. (i.e. 45 was enough, but would selecting from a range between 30 and 40 get you there, the answer to which seems to be "no").
> 
> But yes, this feels like a cludge more than anything, I agree.
> 
> ...


I agree with everything you've said here. I'll take your word that it's easy to implement. Given that, if it's simple to do, and like you said gives us more states of the puzzle, I see no reason not to improve our scrambling programs.


----------



## spdqbr (Nov 28, 2009)

3 weeks later...

A series of circumstances involving:
a) A room full of trampolines
b) An artificial mountain
c) A realization that CCT has done most of the hard work for me

Have all converged in such a way that I could finally finish coding the cube stats program. (The first two left me too battered and bruised to do anything else, while the third made me stop staring at my megaminx with a sense of dread.)

The code I used is all jarred up and attached to the post (accessible through winzip, 7zip, or any other zip program you like). If you just want to run the same test I did, drop to a command line and type
java -jar CubeStats.zip

I have also included all of the source code which anyone is free to use and free to modify (and obligated to tell me of mistakes ). Stefan, feel free to add this to your code repository if you like. I have not thoroughly tested most of the scramble tracking algs. They were ripped out of CCT and slightly modified. (Thanks CCT!)

I have tried to make everything extensible enough to accommodate any other puzzles or scramble generators people would like to throw at it, but I'm sure there's plenty I missed.

The biggest problem I see currently is that with even numbered cubes or generally puzzles without centres, the scrambles look artificially better than they are. This is because for puzzles with centres we can be sure to orient the puzzle before checking the scramble. I thought of defining orientation by a single corner, but I'm not sure how effective this would be. I'm also too lazy to do it right now.

I have also added the option to use the Mersenne Twister for generating scrambles, though I haven't done much testing with it so far. A cursory examination leads me to believe that the MT alg is actually worse for quickly producing random scrambles, but that's without any in depth analysis.

Last but not least, here is the output I got from executing the standard CubeStats.zip as is:

```
3x3x3 - 100000 trials 
20		443.1620148755661
25		211.50820611587486
30		147.651557995157
35		110.43133676349672
40		116.5568456069386
45		125.9131939241523
50		125.83197690233523
55		115.44001564684383

4x4x4 - 100000 trials 
40		119.66390614877406
45		119.06296470866742
50		128.85403943323186
55		124.14081765759262
60		123.39170573210444
65		117.62148576277836
70		121.84658921918096
75		115.44957868000778

WCA Megaminx Scrambler - 100000 trials 
70		720.509295011079
80		511.63070040095505
90		370.6176689892984
100		270.74721971613343
110		197.88530030760757
120		152.74470563906405
130		125.54038855738122
140		107.02766036586652
150		96.1607554902514
160		92.65781383996249
170		90.6069948685559
180		90.85247570597147

Megaminx Non-WCA - 100000 trials 
70		567.6484820433203
80		371.2118931711233
90		251.11136711335263
100		176.45536596516814
110		129.568316248223
120		106.91459055425659
130		99.10904699705459
140		90.8678258737009
150		85.95969286692603
160		86.15861223825169
170		89.65422729883069
180		87.531548963407

Megaminx Old Style - 100000 trials 
70		134.21530108955525
80		106.06835615725058
90		87.26749114317384
100		88.48152263108986
110		89.10289513129383
120		87.06746563398393
130		89.68395593831148
140		88.02043479965928
150		86.79848940122729
```


----------



## jfly (Nov 28, 2009)

spdqbr said:


> a) A room full of trampolines


Boy, am I glad I subscribed to this thread. Sky High was totally awesome. It was great to see you last weekend!



> c) A realization that CCT has done most of the hard work for me


----------



## spdqbr (Nov 28, 2009)

Yup, it was a blast. I totally screwed up one of my fingers though, hopefully it'll stop hurting in time for Utah... You know, where the cool people are going


----------

