# Most Efficient Way to Digitally Store CubeStates?



## Chree (Feb 10, 2016)

After a random chain of thoughts I had last night, the question just occurred to me: If I wanted to store a digital representation of every single possible cube state, how much space would that take up? This naturally led to the question: what is the most efficient way to store any single cube state in a digital format? I don't think that I came up with "The" answer, but I knocked out a few ideas. I searched the forums for a better answer and found none. I hope y'all can help out.

My first instinct was to just look at a scramble picture like we get from TNoodle, which is just a picture of where the stickers wind up after the scramble. TNoodle spits out the familiar (and easy to read) layout: 6 faces with 9 stickers each in a "T" shape, 54 stickers total. I could just as easily imagine a 1x54 grid of colors, where each segment of the grid is assigned to a particular sticker on the cube. To store that digitally, I can assign each color a value, then create an array or string with the sticker colors encoded in sequence. Since there are 6 possible colors for each sticker, I would need a minimum of 3 bits to say what color was assigned to any one sticker.

*6 faces * 9 stickers per face * 3 bits per sticker = 162 bits per cube state*

Ok... but technically, I don't need to store the center stickers. I can just as easily write a decoder that assumes the 6 center stickers never change, so we can just ignore them. That removes 1 sticker per face that we need to store.

*6 faces * 8 stickers per face * 3 bits per sticker = 144 bits per cube state*

To cut it down further, I can make one big assumption: the cube is always solvable. For a solvable cube, if I know the locations and orientations of 11 edges, I can derive the location and orientation of the 12th edge. The same goes for when we know the state of 7 of the 8 corners. In my decoder, I could program it to figure out the last pieces and calculate their orientation. This eliminates the need to store the values for 2 edge stickers and 3 corner stickers, 5 total. So...

*((6 faces * 8 stickers per face) - 5 stickers) * 3 bits per sticker = 129 bits per cube*
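As a rough sketch of the sticker scheme (the 48-sticker version, before dropping the last 5 derivable stickers), here's a Python illustration; the sticker ordering and color numbering are arbitrary choices, the decoder just has to use the same convention:

```python
# Pack 48 non-center sticker colors (values 0-5) into one integer
# at 3 bits per sticker. Sticker order is an arbitrary convention.
def encode_stickers(colors):
    assert len(colors) == 48 and all(0 <= c <= 5 for c in colors)
    state = 0
    for c in colors:
        state = (state << 3) | c
    return state

def decode_stickers(state):
    # Pop 3-bit fields off the low end, then restore original order.
    colors = []
    for _ in range(48):
        colors.append(state & 0b111)
        state >>= 3
    return colors[::-1]

solved = [f for f in range(6) for _ in range(8)]  # 8 stickers per face
packed = encode_stickers(solved)
assert decode_stickers(packed) == solved
assert packed.bit_length() <= 144  # 48 stickers * 3 bits
```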


Then I decided to take a different approach. What if, instead, I only want to encode the properties of each piece? There are far fewer pieces than stickers, so it'll probably be more efficient.

Instead we'll just store the permutation and orientation of each piece. As far as the "Array" is concerned, we can imagine the first value stored is the location of the WBO corner. The next value is for WBR, then WGO, etc. The order that you store the pieces is arbitrary, as long as your decoder knows which values are assigned to which pieces. Then we assign each location a value, so UBL is "0", UBR is "1", UFL is "2", etc. Once again, the actual value is arbitrary, we just need 1 unique value for each unique location. 

There are 8 corners with 8 possible positions: 3 bits each.
Each corner has 3 possible orientations: 2 bits each.
12 edges with 12 possible positions: 4 bits each.
And each edge has 2 possible orientations: just 1 more bit each.

(Edit: Just realized that's really just 5 bits per piece)

So:

*8 corners * (3 bits per permutation + 2 bits per orientation) + 12 edges * (4 bits per permutation + 1 bit per orientation) = 100 bits per cube state*

Quite a jump. And like before, we can ignore the last edge and corner, and just program the decoder to calculate the permutation and orientation based on the state of the rest of the cube.

*7 corners * (3 bits per permutation + 2 bits per orientation) + 11 edges * (4 bits per permutation + 1 bit per orientation) = 90 bits per cube state*
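A minimal sketch of this piece-based packing, assuming the position/orientation values have already been read off the cube (the piece and slot numbering here is arbitrary):

```python
# Pack 7 corners at (3 + 2) bits and 11 edges at (4 + 1) bits each;
# the 8th corner and 12th edge are left for the decoder to deduce.
def pack_state(corners, edges):
    """corners: 7 (position 0-7, orientation 0-2) pairs;
       edges:   11 (position 0-11, orientation 0-1) pairs."""
    bits = 0
    for pos, ori in corners:          # 5 bits per corner
        bits = (bits << 5) | (pos << 2) | ori
    for pos, ori in edges:            # 5 bits per edge
        bits = (bits << 5) | (pos << 1) | ori
    return bits

solved_corners = [(i, 0) for i in range(7)]
solved_edges = [(i, 0) for i in range(11)]
packed = pack_state(solved_corners, solved_edges)
assert packed.bit_length() <= 90   # 7*5 + 11*5 = 90 bits
```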


So I'm down to 90 bits per cube. If I wanted to use this "code" to store every single possible cube state, such that I could program a decoder to pluck out any one of them and give me a visual representation of that cube state... storing all 43 quintillion states would cost about 3.9 sextillion bits... or at 8 bits per byte, roughly 490 quintillion bytes, so something like 490 million petabytes.

I haven't even bothered to do the exact calculation on this because I'm positive that there has to be a more efficient way of encoding a cube state. 

I'm sure SOMEone has thought about this before. Thoughts?


----------



## RicardoRix (Feb 10, 2016)

What about assigning a predetermined order, where each order number means a given cube state? So State 0 would be the solved state. Then you would only need to store the order number, between 0 and 43 quintillion. However many bytes that is, 8 I think.


----------



## AlphaSheep (Feb 10, 2016)

Using Kociemba's coordinates I think you can get it down to 67 bits, one bit more than the limit (66 bits = ceil(log base 2 of 43 quintillion))...
EP is 0..479001599
CP is 0..40319
EO is 0..2047
CO is 0..2186

So EP+(CP*12!)+(EO*12!*8!)+(CO*12!*8!*2^11) will give an integer between 0 and 86 quintillion.
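A quick sanity check of this mixed-radix combination, using the same coordinate ranges as above:

```python
from math import factorial

# Combine the four coordinates into one integer, mixed-radix style:
# EP's radix is 12!, CP's is 8!, EO's is 2^11 (same as the formula above,
# written in Horner form).
def combine(ep, cp, eo, co):
    assert 0 <= ep < factorial(12)
    assert 0 <= cp < factorial(8)
    assert 0 <= eo < 2**11
    assert 0 <= co < 3**7
    return ep + factorial(12) * (cp + factorial(8) * (eo + 2**11 * co))

# Upper bound: 12! * 8! * 2^11 * 3^7, about 86 quintillion as stated.
# That needs 67 bits, one more than the 66-bit minimum, because the
# permutation parity constraint is not factored out.
limit = factorial(12) * factorial(8) * 2**11 * 3**7
assert limit == 2 * 43252003274489856000
assert (limit - 1).bit_length() == 67
```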


----------



## obelisk477 (Feb 10, 2016)

RicardoRix said:


> What about assigning a predetermined order, where each order number means a given cube state? So State 0 would be the solved state. Then you would only need to store the order number, between 0 and 43 quintillion. However many bytes that is, 8 I think.



So you just use the devil's algorithm (3x3 Hamiltonian circuit), and define each cube as a certain distance into the algorithm. Maybe not quite what OP was looking for....

EDIT: I wonder if there's a program that actually does this in reverse. You submit a cube state, and it tells you how far along the Hamiltonian circuit that cube occurs.


----------



## Chree (Feb 10, 2016)

RicardoRix said:


> What about assigning a predetermined order, where each order number means a given cube state? So State 0 would be the solved state. Then you would only need to store the order number, between 0 and 43 quintillion. However many bytes that is, 8 I think.



That beats the 90 bit formula from above. What would the data actually look like?




obelisk477 said:


> So you just use the devils algorithm (3x3 hamiltonian circuit), and define each cube as a certain distance into the algorithm. Maybe not quite what OP was looking for....
> 
> EDIT: I wonder if there's a program that actually does this in reverse. You submit a cube state, and it tells you how far along in the hamiltonian circuit that the cube occurs.



As long as you can assign a small value to a location within the Hamiltonian, then yeah, I think that could work. Although, if the decoder was forced to traverse the Hamiltonian in order to spit out a particular cube state, it would be pretty infeasible to actually recall any particular cube state. But I feel like I don't understand enough about the Hamiltonian to say whether that's an actual issue.

As for your second question: that feels worthy of exploration, even if it doesn't pertain to this thread. That'd be fun.


----------



## qqwref (Feb 10, 2016)

It's quite doable to just store a number from 0 to 43 quintillion, using something like what AlphaSheep posted (although you can also avoid the extra factor of 2 for parity permutation with some cleverness). Then the problem becomes how to quickly convert that into a more usable cube state. Sadly, it's just a bit bigger than 64 bits.

You mention, though, wanting to store a digital representation of every single possible cube state. But at this point, do you really need to? Your representation is just a list of the numbers 0, 1, 2, 3, ... up to 43 quintillion. So you could store nothing, and then if someone asks for the n'th position, you would just put n into your number-to-cube-position function, and not have to actually look anything up.


----------



## Chree (Feb 10, 2016)

That certainly negates any real need to store the cube state itself. Although, as you point out, it makes recalling any random position problematic.

So I could put a further constraint on the question: The Possibility of Human Recall: "Store the cubestate in such a way that a Human Being can look at the data and easily figure out the cube state".

I'm guessing it'd be sorta tough for a human to see the number 1,405,678,923,345 and be able to tell where the White-Green-Orange corner is positioned. But that's fairly easy to do with the methods I laid out above... and it's especially easy with the last method listed.

That said, you're right: going the Hamiltonian route could render this entire question moot.


----------



## cuBerBruce (Feb 10, 2016)

obelisk477 said:


> So you just use the devils algorithm (3x3 hamiltonian circuit), and define each cube as a certain distance into the algorithm. Maybe not quite what OP was looking for....
> 
> EDIT: I wonder if there's a program that actually does this in reverse. You submit a cube state, and it tells you how far along in the hamiltonian circuit that the cube occurs.



I've thought about trying to create a function that would take a cube state and return the index of the position within my Hamiltonian circuit, but have never gotten around to it. It would probably involve some degree of searching, and for that you would have to be using some other way of representing cube states. It would certainly not be as bad as simply iterating the Hamiltonian circuit 1 move at a time. For example, my Hamiltonian circuit has only 2728 places where EO changes. So by determining which cubies are oriented and which aren't, you can obviously restrict the search to only that section or sections where that EO occurs.

But the bottom line is that using an index into a Hamiltonian circuit would be an impractical method for representing cube states.


----------



## shadowslice e (Feb 10, 2016)

Could you just use BLD memo as a basis for a compression technique?

So for corners you could use the first 3 bits to represent the face you are talking about and the next 2 to indicate which sticker on that face, e.g. 00101 may represent the R face (001) and the top-right sticker (01). Then go through all the cycles for corners, leaving out twisted corners (I'll explain why later), and leaving a null at the end, or something that could never come up, such as 111 for the face.

Then you could do edges in the same way, again using 5 bits each (again leaving out flipped edges).

Now, to deal with flipped edges and twisted corners, we can use the fact that they will not have been included in the cycles. We can start from the UBL and number each corner 1-8. If the program can be written to work out which pieces are not affected in the cycles, we can represent all the orientations a corner can be in and proceed through these going UBL UBR DBL DBR etc (obviously only using the ones which don't come up in the cycles). This would take an additional 2 bits per cubie. Leave a null and the same can then be done for edges but they only need 1 bit each.

I would guess that this method would take about 20 bytes of data (160 bits) or less, though the program decoding it would have to be fairly complex (though not too complex, I don't think), and I also sort of dispense with the byte as the standard unit.

I think there would be about 4 nulls, 8 corner targets, 12 edge targets, 2 flipped/twisted pieces.

I know this could be a bit of a headache for a programmer, but this is the best way I could think of for efficient compression, so what does everyone else think?


----------



## RicardoRix (Feb 10, 2016)

Given there are 43 quintillion states to distinguish uniquely, there is no way of compressing the data any smaller than that. I got the original 8-byte calculation wrong; it's actually 9 bytes.



> "Store the cubestate in such a way that a Human Being can look at the data and easily figure out the cube state".


That's not what you originally asked for, but considering the above statement, to make the data 'readable' you have to introduce some inefficiencies in the data storage: something like you originally described, I would guess, or some slightly clever idea like AlphaSheep/shadowslice's suggestion, or possibly a compressed version of the 20-move scramble (20 moves * 4 bits = 10 bytes; you'd need 4 bits for the 0-12 states covering all the different moves R, R', B, B', etc.).

But the real question to this statement should be 'why do you want it readable straight from the DB or screen'? Whenever you need to do this, the program reads the value, then you write a function to translate the stored value to cube state description or an image or a scramble or whatever it is you consider human readable.


----------



## Chree (Feb 10, 2016)

RicardoRix said:


> But the real question to this statement should be 'why do you want it readable straight from the DB or screen'? Whenever you need to do this, the program reads the value, then you write a function to translate the stored value to cube state description or an image or a scramble or whatever it is you consider human readable.



Oh, it doesn't need to be that way at all. I was just trying to bounce us off the Hamiltonian path... which Bruce wound up doing better than I could. It was more 'food for thought' than an actual request. And it's not like this is a question for practical application or anything, anyway. I'm just curious to see what else people can come up with.

For instance, the 90-bit cube state method I described could be readable by humans. But if there was a way to compress or re-encode it further to take up less space, maybe then only a program could read it.

I rather liked shadowslice's BLD code idea. Although there are tons of cube states that would introduce cycle breaks, flipped/twisted pieces, or parity... there'd also be lots of states with solved pieces, too, and those would take up less space. That might warrant some exploration to see if the savings outweigh the costs.


----------



## shadowslice e (Feb 11, 2016)

Well, I did some thinking and I think that the BLD suggestion can store any cube state in <15 bytes (each target takes 4 bits of memory to store);

This will only occur if you have the maximum number of 2 swaps and hence cycle breaks on a cube.

Number of corner targets=10 (40bits)
Null (4/44)
Number of edge targets=16 (64/108)
Null (4/112)
No twisted corners (0/112)
Null(may not be needed as the program could work out there are no twisted corners) (4/116)
No flipped edges (0/116)
Null (same as above) (4/120)

So the whole cube could be stored in 120/8 =15 bytes or perhaps less if the nulls are eliminated (so would be 14 bytes)

And again, this is the worst case that I can think of, so I think this could be quite a good method of compression, though the program may not be very nice to write. It's also quite nice because you don't need a large database for each cubestate.


----------



## SenorJuan (Feb 11, 2016)

I came up with a lower bit-count solution, just a modification of your edges & corners strategy:
7 corners x 3 bits (for the 8 positions) = 21 bits, deduce the 8th position by what the other 7 are.
Consider the orientation of the corners. For one face, eg the D face, there's 81 orientations (3x3x3x3), which could be defined with a 7-bit value. For the opposite U face, you only need to specify 3 corners orientation, so (3x3x3) = 27 possibilities = a 5-bit value. The final corner is deduced from what's already specified. So that's 21+7+5 = 33 bits for corners.
Edges can have two orientations, only 11 need specifying, so 11 bits are needed. Now for positions, it seems wasteful to use a 4 bit number for 12 cases. (12 x 12 x 12) can be represented with 11 bits. So this 'compression' can be used three times for the first 9 edges. The final 2 can be done with 8 bits (4 bits per, as in the simple case). Total = 11 + 11 + 11 + 8 = 41 bits. So edges total = 41 + 11 = 52 bits.

Edit: If you compressed 5 edge positions, you would need (12 x 12 x 12 x 12 x 12) states, which could be represented by an 18-bit number. So encoding as (5-edges + 3-edges + 3 edges) would use 18 + 11 + 11 = 40 bits. It would need a 256K x 20-bit lookup table, though. 

This totals 85 bits. (or 84 with the 5+3+3 edge position encoding)
It does need modest look-up tables to do the 'decompression'. Eg the 11-bit to 12-bit converter is a whopping 3 kilobytes. You may need a similar 'compression' table, for speed.


I can't help think the 'factorial' nature of specifying positions offers scope for bit-saving. Ie. once you've stated where the first corner is, there's then only 7 possible locations for the next one, and after that, only 6 possibilities for the 3rd corner etc etc.
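The 12 x 12 x 12 grouping of three edge positions can be sketched like so (a toy illustration, not tied to any particular edge numbering):

```python
# Three edge positions (each 0-11) packed base-12 into one value below
# 12^3 = 1728, which fits in 11 bits instead of 3 * 4 = 12 bits.
def pack3(a, b, c):
    assert all(0 <= x < 12 for x in (a, b, c))
    return (a * 12 + b) * 12 + c

def unpack3(v):
    v, c = divmod(v, 12)
    a, b = divmod(v, 12)
    return a, b, c

assert pack3(11, 11, 11) == 1727
assert (1727).bit_length() == 11
assert unpack3(pack3(3, 7, 11)) == (3, 7, 11)
```

Decoding here is pure arithmetic, so the lookup tables mentioned above are an optimization for speed rather than a necessity.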


----------



## Chree (Feb 11, 2016)

That's pretty awesome! I remember thinking that 4 bits for EP was wasteful and wondered how 'compression' of this sort could be implemented. Nice work!

And don't worry about the space taken up by a lookup table. You just saved tens of billions of gigabytes of overall storage


----------



## mark49152 (Feb 12, 2016)

Can you get away with storing only 10 edge positions? Both the remaining 2 can be deduced by avoiding parity. Total would be reduced to 80 bits if grouping the edges as 5+5.
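A sketch of why this works: on a solvable cube the edge and corner permutations have the same parity, so once 10 edges are placed (and the corner permutation is known), only one of the two possible orders for the last pair is legal. The numbering below is hypothetical:

```python
def parity(perm):
    """Parity of a permutation given as a list of images (0 = even)."""
    p = 0
    seen = [False] * len(perm)
    for i in range(len(perm)):
        if not seen[i]:
            j, length = i, 0
            while not seen[j]:          # walk one cycle
                seen[j] = True
                j = perm[j]
                length += 1
            p ^= (length - 1) & 1       # a cycle of length L is L-1 swaps
    return p

def complete_edges(first10, corner_parity):
    """Place the last two edge pieces in the two free slots, in the
    order that makes edge parity match corner parity."""
    free = [s for s in range(12) if s not in first10]
    for a, b in ((free[0], free[1]), (free[1], free[0])):
        perm = first10 + [a, b]
        if parity(perm) == corner_parity:
            return perm

# Solved corners (even parity) force the solved edge completion.
assert complete_edges(list(range(10)), 0) == list(range(12))
```

The two candidate orders differ by a single swap, so exactly one of them always matches the corner parity.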


----------



## ch_ts (Feb 12, 2016)

I think that the most straightforward way is to just encode EP, EO, CP, and CO separately, then concatenate them all together:
EP from 0 to 12!-1, use 29 bits
EO from 0 to 2^11-1, use 11 bits
CP from 0 to 8!-1, use 16 bits
CO from 0 to 3^7-1, use 12 bits
Total of 68 bits.

For permutations, that would mean something like this (let's use 8 pieces as an example):
0,1,2,3,4,5,6,7 -> 0 -> 0000000000000000
0,1,2,3,4,5,7,6 -> 1 -> 0000000000000001
0,1,2,3,4,6,5,7 -> 2 -> 0000000000000010
0,1,2,3,4,6,7,5 -> 3
...
7,6,5,4,3,2,1,0 -> 40319 -> 1001110101111111

Edge orientation: use 0 if oriented, 1 if flipped:
something like 00100101110
(Deduce orientation of 12th edge from the other 11 so it's not necessary to include it)

For corners orientation: 0 if correct, 1 if twisted CW, 2 if twisted CCW:
something like 1002201
Get a base 3 number, convert to binary.
(Deduce orientation of 8th corner from the other 7 so it's not necessary to include it)


See this page of Jaap's for more information on indexing orientations and permutations:
http://www.jaapsch.net/puzzles/compindx.htm
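The permutation indexing in the example above is the lexicographic rank, which can be sketched like this (matching the table, though a real program would use something faster than list scans):

```python
from math import factorial

def perm_to_index(perm):
    """Lexicographic rank of a permutation of 0..n-1."""
    n = len(perm)
    remaining = list(range(n))
    index = 0
    for i, p in enumerate(perm):
        # Count how many smaller unused values could have come first.
        index += remaining.index(p) * factorial(n - 1 - i)
        remaining.remove(p)
    return index

assert perm_to_index([0, 1, 2, 3, 4, 5, 6, 7]) == 0
assert perm_to_index([0, 1, 2, 3, 4, 5, 7, 6]) == 1
assert perm_to_index([0, 1, 2, 3, 4, 6, 7, 5]) == 3
assert perm_to_index([7, 6, 5, 4, 3, 2, 1, 0]) == 40319
assert format(40319, '016b') == '1001110101111111'
```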


----------



## Chree (Feb 12, 2016)

mark49152 said:


> Can you get away with storing only 10 edge positions? Both the remaining 2 can be deduced by avoiding parity. Total would be reduced to 80 bits if grouping the edges as 5+5.



Is this true? It seems true. Is that true? I mean... orientation is independent, and would still need to be stored... but permutation. Huh.



ch_ts said:


> I think that the most straightforward way is just encode EP, EO, CP, and CO separately then concatenate them all together:
> EP from 0 to 12!-1, use 29 bits
> EO from 0 to 2^11-1, use 11 bits
> CP from 0 to 8!-1, use 16 bits
> ...



*bows*


----------



## nvpendsey (Feb 12, 2016)

How's this for CP

1st corner has 8 positions so 3 bits
2nd corner has 7 positions so 3 bits
3rd corner has 6 positions so 3 bits
4th corner has 5 positions so 3 bits
5th corner has 4 positions so 2 bits
6th corner has 3 positions so 2 bits
7th corner has 2 positions so 1 bit
8th corner has 1 position so 0 bits

Total 17 bits.

(This might be wrong but I don't know anything about how this works but just had this idea so I posted it)
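This does work; it's a variable-width version of the factorial number system. A sketch (with an arbitrary corner numbering):

```python
# Record each corner as its index among the slots still free, using only
# as many bits as that count needs: 3,3,3,3,2,2,1,0 bits = 17 total.
def encode_cp(perm):
    remaining = list(range(8))
    bits = ""
    for p in perm:
        width = (len(remaining) - 1).bit_length()
        if width:
            bits += format(remaining.index(p), f"0{width}b")
        remaining.remove(p)
    return bits

assert len(encode_cp(list(range(8)))) == 17
```

It still spends 17 bits where a single lexicographic index needs only ceil(log2(8!)) = 16, because each field rounds up to a whole number of bits.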


----------



## shadowslice e (Feb 12, 2016)

ch_ts said:


> ...



That seems like the best shot overall. What does the database look like?

I suppose it would be relatively easy to write the code which could generate the numbers and work out what they mean though. Just wondering when this would become better than the BLD thing I posted above which averaged something like 10-12 bytes but seemed to have less in the database.

Also, I think we can now really see the power of collaboration in how far the initial total has come down.


----------



## SenorJuan (Feb 12, 2016)

An alternative approach:
If you only need 20 moves to reach any state, and there are (6 x 3) = 18 possible ways to make a turn, then you would need theoretically 4.17 bits x 20 = 83.4 bits. 
Edit: see later post: only 81 bits are needed. 
This also brings up the fact that 20 moves is the max; you may need only 15, say, so fewer bits would be needed for those cases. Which makes me think that rather than the rigid '85 bits' or whatever, a flexible strategy where some states may only need 50 bits to describe them should give a lower average bit-count, assuming you were wanting to store lots of states.

I was also contemplating if you could somehow specify the position of (say) a corner in a relative way, rather than an absolute way, ie. after specifying the first corner, state what the next corner is from the remaining 7 choices, as clearly you can't have two pieces in the same location. In theory this may trim off 1 bit from corner positions, and 1 from edge positions?

An edge position 'test' run:
1st edge: 4 bits...wasteful
2nd/3rd edges: 11 x 10 =110 states, this can be stored as a 7-bit value.
4th/5th, 6th/7th, 8th/9th edges: as above.
10th edge: 4 bits..wasteful
11th edge: as Mark R. pointed out, not necessary to specify it.
Total 4 + 4 x 7 + 4 = 36 bits.
It's the same total as the [(5 edges + 5 edges) = 18 + 18 bits] result, but with tiny, rather than massive, look-up tables.

Something tells me the 1st edge and 10th edge should combine into a 7 bit value if you're able to specify _any_ of the last 3 edges as your 10th one, ie. 12 x 10 states, =120, = 7 bits. (this assumes that if you can specify 10 out of the 11 positions, and there are 3 edges still unspecified, then you're going to 'hit' one.)
Combining 1st edge, 2nd/3rd edge and 11th edge gives a total no. of states that can be represented with 14 bits. That would give a total of 14 + 3 x 7 = 35 bits. Modest size look-up table needed.


----------



## shadowslice e (Feb 12, 2016)

Sort of a side note: if you wanted to store the entire 43 quintillion cube states, you could further reduce that using all the rotations (both sorts), reflections and inverses, to reduce by a factor of 2^3*2*24*6 ~ 2300 (unless I got my calculations wrong somewhere), if you add a certain tag at the front to show the rotation/reflection, which would be 10 bits long I think, though it may be possible to reduce that further.


----------



## Herbert Kociemba (Feb 12, 2016)

ch_ts said:


> I think that the most straightforward way is just encode EP, EO, CP, and CO separately then concatenate them all together:
> EP from 0 to 12!-1, use 29 bits
> EO from 0 to 2^11-1, use 11 bits
> CP from 0 to 8!-1, use 16 bits
> ...



Since the position of the last two edges is forced, you can use an EP* from 0 to 12!/2 - 1 by combining the index of the combination of 10 edges (Binomial(12,10) possibilities) with the 10! possible permutations of these 10 edges (Binomial(12,10)*10! = 12!/2). I used this kind of coordinate several times in my programs. This saves 1 bit.
Of course you can now build the total Index for example with

3^7(8!(2^11 EP*+EO)+CP)+CO

and you get a unique index from 0...43252003274489856000-1 for each cube, which needs 66 bits.
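The arithmetic checks out; a quick verification of the coordinate sizes:

```python
from math import comb, factorial

# EP* ranges over 12!/2 values: choose the slots of 10 edges
# (C(12,10) ways) and their order (10! ways); the last two are forced.
assert comb(12, 10) * factorial(10) == factorial(12) // 2

# The total index 3^7*(8!*(2^11*EPstar + EO) + CP) + CO covers exactly
# the 43 quintillion states and fits in 66 bits.
total = factorial(12) // 2 * 2**11 * factorial(8) * 3**7
assert total == 43252003274489856000
assert (total - 1).bit_length() == 66
```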


----------



## Herbert Kociemba (Feb 12, 2016)

shadowslice e said:


> Sort of a side note; if you wanted to store the entire 43 quintillion cube states, you could further reduce that using all the rotations (both sorts) reflections and inverses to reduce by a factor of 2^3*2*24*6~2300 unless I for my calculations wrong somewhere if you add a certain tag at the front to show rotation/reflection which would be 10 bits long I think tough it may be possible to reduce that further.



This will not help. Your tag bits will eat up your savings by the reduction. It is impossible to get below Log[2,43 quintillion] bits.
Btw. the reduction factor by symmetries and inversion is only 96, not 2300.


----------



## hamfaceman (Feb 12, 2016)

I think you guys may be overthinking this. How about a scramble sequence?


----------



## shadowslice e (Feb 12, 2016)

hamfaceman said:


> I think you guys may be overthinking this. How about a scramble sequence?



That has come up. It's less efficient than some of the other methods and would also mean you need to gen all of the algs for all of the cases and store them in binary.


----------



## EMI (Feb 12, 2016)

The efficiency of your representation depends on what you want to do with it. The length of the shortest possible (binary) representation is obvious. (Btw, too bad that it is 66 (65?) bits, not 64...)
But to make calculations fast, you want to store a state as some bit string(s) in such a way that simple bit operations made up from AND, OR, XOR etc. and some constants can represent single moves, as well as the "concatenation" of two different states maybe.
I have wondered before how those fast solving programs (Cube Explorer, ...) implement it, but I was too lazy to find out. ^^


----------



## RicardoRix (Feb 12, 2016)

shadowslice e said:


> That has come up. It's less efficient than some of the other methods and would also mean you need to gen all of the algs for all of the cases and store them in binary.



That's not true. Whatever is deemed 'human readable' has not been defined, but if you encode a scramble then the 'decoder' method (and everything suggested so far would need some kind of decoder) would just give you a scramble; no need for any kind of 43q lookup.

hamfaceman may be talking about storing the scramble sequence without any compression at all, which would not need any kind of decoder to be human readable. It would need a few more bytes, but even so it would be an efficient solution for this scenario.
Also a coded scramble (19 different types of move (5 bits), 20 moves = 100 bits = 13 bytes) is very easily decoded, no lookup; the same is true for your BLD method, and that also makes it more efficient than the 14 bytes you quoted earlier.


----------



## SenorJuan (Feb 12, 2016)

If you're describing a scramble sequence, then after the first turn (18 possibilities = 5 bits) you have less options for the subsequent move: just 5 faces & 3 rotation choices = a 4-bit number. Then a 20-turn sequence would be 5 + 19 x 4 = 81 bits long.
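Sketching that encoding (with a made-up face and turn ordering; any fixed convention works):

```python
FACES = "URFDLB"           # hypothetical face order
TURNS = ["", "2", "'"]     # hypothetical turn order

def encode_moves(moves):
    """moves: list of (face, turn) pairs with no two consecutive moves
    on the same face. First move: 5 bits (18 values); later moves:
    4 bits (5 remaining faces * 3 turns = 15 values)."""
    bits = ""
    prev = None
    for face, turn in moves:
        if prev is None:
            bits += format(FACES.index(face) * 3 + TURNS.index(turn), "05b")
        else:
            other = [f for f in FACES if f != prev]   # 5 allowed faces
            bits += format(other.index(face) * 3 + TURNS.index(turn), "04b")
        prev = face
    return bits

# A 20-move scramble: 5 + 19 * 4 = 81 bits, as computed above.
assert len(encode_moves([("R", ""), ("U", "")] * 10)) == 81
```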


----------



## RicardoRix (Feb 12, 2016)

SenorJuan said:


> If you're describing a scramble sequence, then after the first turn (18 possibilities = 5 bits) you have less options for the subsequent move: just 5 faces & 3 rotation choices = a 4-bit number. Then a 20-turn sequence would be 5 + 19 x 4 = 81 bits long.



Fair enough, very clever. I was considering a 'no-move' for the 19th different type of move, but if you go to that extent I suppose the length of the data could deal with no-moves.


----------



## SenorJuan (Feb 12, 2016)

I guess you would choose the 'unused' 16th value to define a no-move, and hence end-of-scramble. And ditto, one of the many spare states for the first move could be used to represent no-move, ie. the cube is solved, without needing any turns.


----------



## rokicki (Feb 12, 2016)

Herbert Kociemba said:


> Since the position of the last two edges is forced you can use a EP* from 0 to 12!/2 -1 by combining the index of the combination of 10 edges (Binomial(12,10) possibilities) with the 10! possible permutations of these 10 edges (Binomial(12,10)*10! = 12!/2). I used this kind of coordinate several times in my programs. This saves 1 bit.
> Of course you can now build the total Index for example with
> 
> 3^7(8!(2^11 EP*+EO)+CP)+CO
> ...



I have to ask, why do we need a particularly concise representation? If you're not storing many positions, the difference between a cubie representation (20 bytes, one per cubie, each holding a value of 0..23) and a 66-bit index (which generally really takes 128 bits or 16 bytes due to alignment) is not a big deal.

Directly indexing positions has a certain elegance, but it's not too far from just lexicographically ordering a working cubie representation.

The cubie representation permits operations such as moves, inversions, reorientations, and direct position multiplication (pretty much the standard group operations). And it's pretty quick to turn into an index, or to derive from an index.

So if the reason you want such a concise representation is to store many many positions, then you get into ideas of how to efficiently store a finite subset of a large finite set, or conversely, how many distinct elements you can store in a given amount of memory. While you can't do a whole lot better than 66 bits per, you can do somewhat better, depending on the operations you need to perform and their required relative speed.

For instance, to store 200M positions (say, known distance-20 positions) concisely, you might just index them, sort the indices, and then break them out into about 5000 separate lists by the leading 13 bits, reducing the bit count of each entry to about 53 bits (and paying a small amount of overhead for the required data structures to keep track of this information). If access speed is not important, you can just encode the delta between adjacent entries in a big sorted list; this would probably be about 40 bits per position (using Huffman or arithmetic coding to encode the bit length of each entry).

But you have to ask yourself if saving 26 bits out of 66 or about 40% is worth all these heroics.
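The delta idea in miniature (toy numbers; a real implementation would add the variable-length bit encoding on top):

```python
# Sort the indices and store gaps between neighbors instead of absolute
# values. With N random positions out of M, the average gap is about M/N,
# so each entry needs roughly log2(M/N) bits plus encoding overhead.
def delta_encode(indices):
    s = sorted(indices)
    return [s[0]] + [b - a for a, b in zip(s, s[1:])]

def delta_decode(deltas):
    out, acc = [], 0
    for d in deltas:
        acc += d
        out.append(acc)
    return out

sample = [90, 7, 300, 41]
assert delta_decode(delta_encode(sample)) == sorted(sample)
```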


----------



## Bruce MacKenzie (Sep 4, 2017)

There are 4.33 x 10^19 3x3x3 cube positions. One may map each cube position onto the integers and uniquely represent a cube state in I think 65 bits. This is commonly done to index pruning tables based on cube subgroups using some kind of radix compression scheme. For example the configuration of the corner cubies may be represented by the number:

... 15 * (18 * (21 * {0 to 23} + {0 to 20}) + {0 to 17} ) + {0 to 14} ) ...

The first corner may be in any of 24 states numbered 0 to 23, leaving the second corner any of 21 states, etc.


----------



## shadowslice e (Sep 4, 2017)

Bruce MacKenzie said:


> There are 4.33 x 10^19 3x3x3 cube positions. One may map each cube position onto the integers and uniquely represent a cube state in I think 65 bits. This is commonly done to index pruning tables based on cube subgroups using some kind of radix compression scheme. For example the configuration of the corner cubies may be represented by the number:
> 
> ... 15 * (18 * (21 * {0 to 23} + {0 to 20}) + {0 to 17} ) + {0 to 15} ) ...
> 
> The first corner may in any of 24 states numbered 0 to 23, leaving the second corner any of 21 states, ect.


This was discussed earlier in the thread but was ruled out as an answer to the problem as envisioned by the op pretty early on.


----------



## Bruce MacKenzie (Sep 4, 2017)

shadowslice e said:


> This was discussed earlier in the thread but was ruled out as an answer to the problem as envisioned by the op pretty early on.


The point being made is that a representation of a cube position requires a minimum of 65 bits. Any scheme requiring more than that is not as efficient in terms of space. Any scheme using less than that will not be able to represent all possible states.

Woops, I went to the calculator and found that it requires 66 bits, not 65.


----------



## shadowslice e (Sep 4, 2017)

Bruce MacKenzie said:


> The point being made is that a representation of a cube position requires a minimum of 65 bits. Any scheme requiring more than that is not as efficient in terms of space. Any scheme using less than that will not be able to represent all possible states.
> 
> Woops, I went to the calculator and found that it requires 66 bits, not 65.


I agree that it is not as efficient in terms of space, though the op stated he wanted a somewhat "human-readable" way of storing the information rather than just assigning each state a number (though again that is the most efficient way to store cubestates).

A way of getting the mapping on the cubestates is by using the devil's algorithm and assigning each state a certain distance along the cycles.


----------



## Bruce MacKenzie (Sep 4, 2017)

shadowslice e said:


> I agree that it is not as efficient in terms of space though the op stated he wanted a somewhat "human-readable" way of storing the infomation rather than just assigning each state a number (though again that is the most efficient way to store cubestates).
> 
> A way of getting the mapping on the cubestates is by using the devil's algorithm and assigning each state a certain distance along the cycles.


A cube simulation will have a "usable" representation of the cube. Commonly, the simulation will have a representation which specifies the position and orientation of the eight corner cubies and the twelve edge cubies. The simulation has routines for graphically rendering the representation to the display. It can apply face turns to the representation, multiply two representations, etc.

Anyway, if it becomes necessary to store cube positions in a minimum of space the algorithms for mapping such a representation onto the integers 0 to 43,252,003,274,489,855,999 are straightforward.


----------



## Bruce MacKenzie (Sep 7, 2017)

rokicki said:


> 66-bit index (which generally really takes 128 bits or 16 bytes due to alignment) is not a big deal.


You're correct that if the configuration is saved in a structure such as the one below, the structure uses 16 bytes.

```c
struct configuration { uint8_t rump; uint64_t radix; };  /* padded and aligned to 16 bytes */
```

However, the data may be stored in 9 bytes if a 9-element array of uint8 is used instead, i.e.

```c
#include <stdint.h>

struct configuration { uint8_t bytes[9]; } foo;

/* Reinterpret the first 8 bytes as the packed 64-bit index
   (assumes the platform tolerates the type-punned, unaligned read). */
uint64_t *radix = (uint64_t *)foo.bytes;
uint64_t permutationIndex = *radix;

uint64_t orientationIndex = permutationIndex % MAX_ORIENTATION_INDEX;

/* The 9th byte carries the 2 bits that don't fit in the first 8. */
permutationIndex = 4 * (permutationIndex / MAX_ORIENTATION_INDEX) + foo.bytes[8];
```


----------



## Bruce MacKenzie (Sep 9, 2017)

Chree said:


> So I could put a further constraint on the question: The Possibility of Human Recall: "Store the cubestate in such a way that a Human Being can look at the data and easily figure out the cube state".



The best human readable representation I've run across is that of Michael Reid, who wrote one of the first optimal solvers 15 or 20 years ago. He represented the cube as a facelet permutation. The representation for a solved cube is:

UF UR UB UL DF DR DB DL FR FL BR BL UFR URB UBL ULF DRF DFL DLB DBR

and an example of a scrambled cube would be:

DR FL FD UB DL LB UL BD BR RF UR UF LFU BDL RFD URB LDF DBR LUB FRU

This specifies that the down_right cubie is in the up_front slot with the down facelet in the up position. The front_left cubie is in the up_right slot with the front facelet in the up position, and so forth. Stripped of spaces this rep uses 48 bytes.


----------



## CuberFles (Sep 21, 2017)

But 48 bytes is like 384 bits, not at all efficient :?


----------



## Bruce MacKenzie (Sep 21, 2017)

CuberFles said:


> But 48 bytes is like 384 bits, not at all efficient :?


Not efficient but it is human readable. Also, it may be directly used to represent the cube as a facelet permutation. One may then use permutation multiplication to model cube manipulations.

```
                           1              2             3             4
Index         12 34 56 78 90 12 34 56 78 90 12 34 567 890 123 456 789 012 345 678
Slot          UF UR UB UL DF DR DB DL FR FL BR BL UFR URB UBL ULF DRF DFL DLB DBR
Configuration DR FL FD UB DL LB UL BD BR RF UR UF LFU BDL RFD URB LDF DBR LUB FRU
```

*Facelet Permutation (GAP Canonical Cycle Notation):*
*(1,11,24,2,12,23)(3,19,18,22,4,20,17,21)(5,10,16,13,7)(6,9,15,14,8)(25,35,29,43,33,37,42,48)(26,36,30,44,31,38,40,46)(27,34,28,45,32,39,41,47)*


----------

