# Kociemba explained verbally?



## Kyle Byrne (Oct 14, 2013)

Hi Guys, I'm currently writing a paper on the maths of the cube for 'lay' people' who don't have a high level of maths knowledge. One of the things I wanted to do was explain how cube solvers work , so I wondered if anyone could explain using words only (I'll let you have a couple of formulas but keep it to a minimum) how Kociembas algorithm actually works.


----------



## DrKorbin (Oct 14, 2013)

Isn't it enough?


----------



## Kirjava (Oct 14, 2013)

http://kociemba.org/twophase.htm


----------



## ben1996123 (Oct 14, 2013)

if you dont understand it yourself then you prolaby arent going to be very good at explaining it to other people


----------



## Lucas Garron (Oct 15, 2013)

Instead of G1, let's ignore group theory and use the first two layers as an analogy.

Two-Phase

The core of the 2-phase approach is this:


Solve the cube into a state with a certain property.
Solve the rest of the cube.

If we select F2L as our property, we get:


Solve the first two layers.
Solve the last layer.

If you solve F2L with the fewest moves possible, then solve the resulting LL case optimally, you'll end up with a decently short solution. However, there might be a shorter solution: in that case, the first phase might take more moves, but get us an easy LL case (or an LL skip!).

Iterative Deepening (for Phase 1)

Let's say our first solution took 15 moves for F2L and 11 moves for LL. That means that our cube takes at most 15 + 11 = 26 moves to solve.

So let's try looking at _all_ the solutions that take 15 moves for F2L. If any of them give us an LL that takes fewer than 11 moves, we can find a shorter solution. In fact, we can extend this and keep looking. Here's our plan:


Try all the solutions that take 15 moves for F2L and fewer than 26 - 15 = 11 moves for LL.
Try all the solutions that take 16 moves for F2L and fewer than 26 - 16 = 10 moves for LL.
Try all the solutions that take 17 moves for F2L and fewer than 26 - 17 = 9 moves for LL.
...
Try all the solutions that take 25 moves for F2L and fewer than 26 - 25 = 1 moves for LL.
Try all the solutions that take 26 moves for F2L and fewer than 26 - 26 = 0 moves for LL.

If there is a solution that takes fewer than 26 moves for the entire cube, F2L clearly has to be finished in at most 26 moves. Therefore, our optimal overall solution has to be in one of these cases.

Now let's say that along the way we find a solution that takes 17 moves for F2L and 8 moves for LL. Now we know that the entire cube can be solved in 17 + 8 = 25 moves. We abort the original 26-plan and continue our search with a new upper bound:


Try all the solutions that take 17 moves for F2L and fewer than 25 - 17 = 8 moves for LL.
Try all the solutions that take 18 moves for F2L and fewer than 25 - 18 = 7 moves for LL.
...
Try all the solutions that take 25 moves for F2L and fewer than 25 - 25 = 0 moves for LL.

If we keep going, we reach a point where any solution we haven't considered must be longer than the one we have -- so our solution is optimal.

For 3x3x3, we happen to know that the optimal solution must be at most 20 moves... but that's only because we know God's number. The approach works even if we don't know God's number.

The Domino Phase (a.k.a. Group Theory is Awesome)

Of course, Kociemba's algorithm doesn't solve F2L first. Once you've solved F2L, you can only do U moves without breaking up your progress, which is counterproductive.

Instead, Kociemba gets the cube into one of the states in "G1", which means it has the following properties:


All the corners are oriented (like in 3OP).
All the edges are oriented (like in 3OP).
All the middle layer edges are already in the middle layer.

From there, you can solve the cube by using only <U, D, F2, B2, R2, L2>. That is, you can basically pretend it's a 3x3x2 domino. (Try it!) (Aside: This means that you've put the cube into a _subgroup_ which is its own puzzle.)

The nice thing about this is that once the cube is in the G1/Domino, you can search for an optimal Domino solution.

More importantly, _every_ cube solution ends with some number of domino moves. That number might be 0, but there's a good chance that there is _some_ optimal solution that uses a few. This means we can run the two-phase search similar to before, except we can restrict ourselves to a domino search in the second half.

There are a lots of details, including:


Solving a cube into G1 is about as hard as solving the rest. This means that both parts of the search take roughly the same time. (If we tried to find an optimal solution using the F2L approach, we would have to spend way more time trying to solve F2L than a computer could ever handle. There are very few cube states where it's possible to find an optimal solution where F2L is solved early.)
Since G1 is a group, there is a lot of _symmetry_ in the search. This speeds things up by allowing you to search multiple things at once.
You can take lots of shortcuts. Let's say you already have a solution of 26 moves, and are considering a solution that has taken 18 moves already. If you know that it takes at least 10 more moves to solve the corners, then you can stop searching for that solution. There are efficient ways to do this by using _prune tables_. The symmetry of G1 also allows these prune tables to be very efficient.
If you run Kociemba for a short while, you can get a pretty short solution. If you run it for longer, you will quickly get some better solutions, and after a while verify the optimal solution. The choice of G1 really helps with this.
Solvers for other puzzles usually use a similar approach, except they use more phases. This WCA scrambler for 4x4x4 uses 4 phases to generate a scramble for a random state (but it doesn't continue to find an optimal solution).
Instead of iterative deepening, Kociemba implementations usually use IDA*, which is iterative deepening combined with a heuristic. Instead of going through the solutions blindly, the algorithm tries to go down path that look like they might have a shorter solution. This helps find shorter solutions earlier, which cuts down on the additional searching.

Real-World Kociemba.

Look at the reconstruction of Mats Valk's 5.55 WR. The scramble is:



> D2 U' R2 U F2 D2 U' R2 U' B' L2 R' B' D2 U B2 L' D' R2



Notice that the first half uses only <U, D, F2, B2, R2, L2>. This is because TNoodle uses Shuang Chen's min2phase to do the following:


Generate a random state.
Solve the state using Kociemba.
Invert the solution to get a scramble for the state.

The last half of the solution becomes the first half of the scramble, which is why the initial moves are in G1. And you can see that G1 is about half of the scramble (9 out of 19 moves).

You can see similar behaviour with solutions from ACube and Cube Explorer. ACube will even output a . at the transition.

(Human) Thistlethwaite

If you find this interesting, take a look at Ryan Heise's Human Thistlethwaite algorithm. The Kociemba algorithm is basically a reduced version of Thistllethwaite's original algorithm that runs well on modern computers. Reading through Ryan's explanation and trying a few solves gives you a feel for the kinds of things a computer does differently from a human.

Jaap's Puzzle Page also has a lot of details on computer cubing and approaches for other puzzles.


----------



## Carson (Oct 15, 2013)

Lucas Garron said:


> What Lucas said.



The best explanation I have seen on this forum.


----------



## cmhardw (Oct 15, 2013)

Lucas, that was clearly written, very interesting to read, and just an all around amazing post! Thank you for writing that!


----------



## Tim Major (Oct 15, 2013)

Thanks Lucas. I had never bothered to look into it as I wasn't really interested. Now I understand how it can generate solutions so ridiculous quickly. I'm going to look into how it solves it into G1 now. After that it can just generate the optimal solution due to the far lower number of turns to choose from and maximum movement, which seems pretty simple. I'll read into G1 now.


----------



## Kyle Byrne (Oct 15, 2013)

ben1996123 said:


> if you dont understand it yourself then you prolaby arent going to be very good at explaining it to other people


Which is why I'm asking for help...




Lucas Garron said:


> Instead of G1, let's ignore group theory and use the first two layers as an analogy....


Thanks very much Lucas that explanation was excellent


----------



## guusrs (Oct 15, 2013)

Nice to read, well done Lucas


----------



## Stefan (Oct 15, 2013)

If you want to explain to 'lay people' how cube solvers work, why go for Kociemba and not something simpler like Korf?

Nice explanation, Lucas, though some comments:



Lucas Garron said:


> Let's say our first solution took 11 moves for F2L and 15 moves for LL. That means that our cube takes at most 11 + 15 = 26 moves to solve.
> 
> So let's try looking at _all_ the solutions that take 15 moves for F2L. If any of them give us an LL that takes fewer than 11 moves, we can find a shorter solution.



You switched 11+15 to 15+11. 



Lucas Garron said:


> In fact, we can extend this and keep looking:
> 
> 
> Try all the solutions that take 15 moves for F2L and less than 26 - 15 = 11 moves for LL.
> ...



The upper bound should be updated. If in for example the 16+x search you find an x=8 solution, you should stop using 26 and instead start using 24.

Also, at least in Cube Explorer we usually don't go all the way but tell it to stop once it has found a let's say 21 or 20 moves solution, as the last rounds are quite slow and such solutions are very good already.



Lucas Garron said:


> If we tried to find an optimal solution using the F2L approach, we would have to spend way more time trying to solve F2L than a computer could ever handle.



I don't see why. Can you explain or point to an explanation/discussion?


----------



## qq280833822 (Oct 15, 2013)

Stefan said:


> I don't see why. Can you explain or point to an explanation/discussion?


That's because of the last layer contains only 62208 different states, Which means the size of the F2L space will be about 4.3*10^19/62208 = 7*10^14. It's actually too large for most of the personal computers to handle.


----------



## Akiro (Oct 15, 2013)

That was amazingly well written Lucas! Thanks!


----------



## Stefan (Oct 15, 2013)

qq280833822 said:


> That's because of the last layer contains only 62208 different states, Which means the size of the F2L space will be about 4.3*10^19/62208 = 7*10^14. It's actually too large for most of the personal computers to handle.



Well, we *can* optimally solve F2L in one phase, we can even solve the *entire cube* optimally in one phase. It's got to have to do with it being used as phase 1 of such a two-phase algorithm, but I just don't see how.


----------



## Lucas Garron (Oct 15, 2013)

Stefan said:


> You switched 11+15 to 15+11.


Oops, lemme fix that.



Stefan said:


> The upper bound should be updated. If in for example the 16+x search you find an x=8 solution, you should stop using 26 and instead start using 24.
> 
> Also, at least in Cube Explorer we usually don't go all the way but tell it to stop once it has found a let's say 21 or 20 moves solution, as the last rounds are quite slow and such solutions are very good already.


I wanted to keep it simple, but on re-reading you're right... I should definitely mention it in that section.




Stefan said:


> Well, we *can* optimally solve F2L in one phase, we can even solve the *entire cube* optimally in one phase. It's got to have to do with it being used as phase 1 of such a two-phase algorithm, but I just don't see how.



Well, I'm describing a simple 2-phase algorithm Phase 1 is usually brute force search (with a lot tricks: transition table, pruning, meet-in-the-middle). In particular, you need to enumerate *all* F2L solutions of certain lengths, or at least a lot of them. You can optimize a few things specifically for F2L because there are so few LL positions, but most of that wouldn't help explain Kociemba's algorithm.


----------



## DrKorbin (Oct 15, 2013)

Lucas Garron said:


> Solvers for other puzzles usually use a similar approach, except they use more phases. This WCA scrambler for 4x4x4 uses 4 phases to generate a scramble for a random state (but it doesn't continue to find an optimal solution).



Could you name these 4 phases, please?


----------



## Lucas Garron (Oct 15, 2013)

DrKorbin said:


> Could you name these 4 phases, please?



Well, the "Algorithm" section in the readme explains that it uses steps from Tsai's 8-step algorithm, which gives us the following:


```
1. < R,L,F,B,U,D,r,l,f,b,u,d >
2. < R,L,F,B,U,D,r,l,f2,b2,u2,d2 >
3-4. < R2,L2,F,B,U,D,r2,l2,f2,b2,u2,d2 >
5-8. < R,L,F,B,U,D >
```


----------



## DrKorbin (Oct 15, 2013)

Lucas Garron said:


> the "Algorithm" section in the readme explains



Thanks!
How didn't I notice :fp


----------



## cuBerBruce (Oct 16, 2013)

Lucas Garron said:


> Well, the "Algorithm" section in the readme explains that it uses steps from Tsai's 8-step algorithm, which gives us the following:
> 
> 
> ```
> ...



But it really does 5-8 as 2 phases, right?


----------



## Stefan (Oct 16, 2013)

Lucas Garron said:


> Well, I'm describing a simple 2-phase algorithm Phase 1 is usually brute force search (with a lot tricks: transition table, pruning, meet-in-the-middle). In particular, you *need to enumerate *all* F2L solutions of certain lengths, or at least a lot of them*.



But in Kociemba, you also need to enumerate a lot of phase 1 solutions. Maybe not at the shortest phase 1 lengths, but then later when you're trying longer phase 1 lengths. And I don't see why that's easily possible but with F2L it's impossible. Let's say the whole given cube can be solved in x moves and you're looking for phase 1 solutions of length x, aren't then both versions pretty much even enumerating all whole cube solutions?

Btw, would we ever really _"Try all the solutions that take 25 moves for F2L"_, given that the whole cube can always be solved in 20 moves so at phase 1 length 20 we'd find a phase 2 solution of length 0 and stop?



Lucas Garron said:


> Try all the solutions that take 16 moves for F2L and *fewer than* 26 - 16 = 10 moves for LL.
> Try all the solutions that take 17 moves for F2L and *at most* 26 - 17 = 9 moves for LL.



You should stick to _"fewer than"_ (or subtract 1).
And you have _"1 + 8 = 25"_ and _"28 moves for F2L"_ and _"25 - 25 = 7"_ typos.


----------



## Dapianokid (Oct 16, 2013)

Thank you so much Lucas. I like math, so I expected these methods to be easy to grasp. When they weren't, I gave up in frustration. I love this basic introduction to group theory and cube math. It makes so much more sense when explained like that.


----------



## Stefan (Oct 16, 2013)

Dapianokid said:


> I love this basic introduction to group theory and cube math. It makes so much more sense when explained like that.



Where do you see an introduction to group theory?


----------



## Lucas Garron (Oct 16, 2013)

Stefan said:


> But in Kociemba, you also need to enumerate a lot of phase 1 solutions. Maybe not at the shortest phase 1 lengths, but then later when you're trying longer phase 1 lengths. And I don't see why that's easily possible but with F2L it's impossible. Let's say the whole given cube can be solved in x moves and you're looking for phase 1 solutions of length x, aren't then both versions pretty much even enumerating all whole cube solutions?


You can't start pruning the search aggressively until you've found a few solutions. Kociemba starts finding solutions with about 10 moves in phase one. I don't know God's number to solve F2L on average, but I'd guess it's significantly higher -- enough to make the initial BFS really impractical (at least, when "real" Kociemba is an option).



Stefan said:


> Btw, would we ever really _"Try all the solutions that take 25 moves for F2L"_, given that the whole cube can always be solved in 20 moves so at phase 1 length 20 we'd find a phase 2 solution of length 0 and stop?





Lucas Garron said:


> In fact, we can extend this and keep looking. Here's our plan:
> ...
> We abort the original 26-plan and continue our search wtih a new upper bound:
> ...
> For 3x3x3, we happen to know that the optimal solution must be at most 20 moves... but that's only because we know God's number. The approach works even if we don't know God's number.






Stefan said:


> You should stick to _"fewer than"_ (or subtract 1).
> And you have _"1 + 8 = 25"_ and _"28 moves for F2L"_ and _"25 - 25 = 7"_ typos.


Yeah, I do have a few consistency issues. I'll fix those.


----------



## Lucas Garron (Oct 16, 2013)

cuBerBruce said:


> But it really does 5-8 as 2 phases, right?



Yeah, it uses min2phase. ;-)
The name is technically "TPR-4x4x4-Solver", which is "Three-Phase-Reduction Solver + 3x3x3 Solver". From the perspective of the algorithm, though, the 3x3x3 stage is a 1-phase black box. You could call it a 5-phase solution overall, but I'm not sure if that's much more accurate.


----------



## cuBerBruce (Oct 17, 2013)

Lucas Garron said:


> The name is technically "TPR-4x4x4-Solver", which is "Three-Phase-Reduction Solver + 3x3x3 Solver". From the perspective of the algorithm, though, the 3x3x3 stage is a 1-phase black box. You could call it a 5-phase solution overall, but I'm not sure if that's much more accurate.



But if the pseudo 3x3x3 phase actually used a conventional single-phase solver, you would get slightly shorter solutions (gnerally speaking) and execution time could be quite lengthy (based on the performance of available 3x3x3 single-phase solvers). So even though the pseudo 3x3x3 phase may be viewed as a black box from a point of view of coding the 4x4x4 solver, as far as how the solver works internally, I think it's clearly a 5-phase solver. I would say it's a 5-phase solver where optimization is performed across the last two stages, but not across any of the other phases.

I suppose you can be more general about what you might consider a "phase" to consist of. Although the pseudo 3x3x3 portion of the solve is internally implemented as two phases, the fact that the 4x4x4 solver treats it as a black box means that the two Kociemba phases have to be treated as a single "phase" from the point of view of trying to do any optimizations across phases. So in that sense it may make some sense to talk about it as 4 phases. But the fact that the code treats the 3x3x3 phase as a black box doesn't really change the fact that a total of 5 phases are actually used in producing the solution (which is inverted to create the scramble).


----------

