# Speedcubing statistics: preliminary report



## badmephisto (Jun 12, 2009)

I got hold of the entire WCA database, which holds every single time that everyone got in every competition. Over the last two days I looked at a couple of trends that came to mind, for 3x3 speedcubing only, and the results can be seen below:

http://tinyurl.com/nsml8s

Thank you Stefan, Lucas, and especially Patrick for help on getting this database downloaded and parsed!

What do you guys think? Is there anything you found interesting? Surprising? Are there some other things you would be interested in seeing? Any ideas on other graphs or stats? This is a very rough preliminary report. I just want to get some feedback from the community, listen to what you guys think, add some charts, make the whole thing look much nicer, and then publish the whole report and maybe even make a video on the results after. Should be interesting.


----------



## waffle=ijm (Jun 12, 2009)

I enjoyed the graph on "Effects of not practicing (somewhat)". I like the way it's presented.


----------



## Lucas Garron (Jun 12, 2009)

Some preliminary suggestions:

Log-plot of times, so we can see progress near the bottom ends.
http://www.worldcubeassociation.org/results/statistics.php#10 contradicts your data. Did you count IDs?


----------



## rahulkadukar (Jun 12, 2009)

Hey, can I have a copy of the database?


----------



## shelley (Jun 12, 2009)

Interesting statistics!



> Tyson Mao messed up and went from 16.92125 to 37.6933333333 in 28 days <--------------------------lol



That was partly my fault. Though if he had asked me (instead of randomly grabbing a cube out of my bag) I might have been willing to lend him a better cube.


----------



## badmephisto (Jun 12, 2009)

Lucas Garron said:


> Some preliminary suggestions:
> 
> Log-plot of times, so we can see progress near the bottom ends.
> http://www.worldcubeassociation.org/results/statistics.php#10 contradicts your data. Did you count IDs?



What contradicts what? I calculated the averages a LITTLE differently. I take ALL of a cuber's times in a day, discard all DNFs, then chop off the highest and lowest times. The average of the remaining list is that cuber's average for that day.
I discard DNFs because I think they are just a punishment for screwing up, and have much more to do with bad luck than actual ability.
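
The averaging rule described in this post can be sketched in a few lines of Python. This is only an illustration of the description, not badmephisto's actual script: the function name `trimmed_day_average`, the use of `None` for a DNF, and the behaviour when fewer than three valid times remain are all assumptions.

```python
from statistics import mean

DNF = None  # assumption: a DNF is represented as None in this sketch


def trimmed_day_average(times):
    """Average one cuber's times for one day, ignoring DNFs and the two extremes."""
    valid = sorted(t for t in times if t is not DNF)
    if len(valid) < 3:
        # The post does not say what happens when too few times remain,
        # so this sketch simply reports no average.
        return None
    # Drop the single lowest and single highest time, then average the rest.
    return mean(valid[1:-1])


# Made-up times for one cuber on one day.
day = [16.50, 17.25, DNF, 15.75, 18.00, 16.25]
print(trimmed_day_average(day))
```

Note that this pools every round of the day into one list, which is why it can differ from the official per-round averages.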


----------



## rahulkadukar (Jun 12, 2009)

Lars Vandenbergh went from 46.71 to 2:58.49 in 3x3x3 OH single in a span of 20 days, between Danish Open and German Open 2009. lol


----------



## qqwref (Jun 12, 2009)

Very very neat. Can we see:
- Fastest improvement in a short time? (Both times should be 3x3 PBs, so we don't just count people who did really badly for some reason.)
- A graph of PB singles vs. PB averages? Who's got the best and worst single:average ratio? What was the best and worst single:average ratio in any given non-DNF-average round?
- Most times someone has improved their 3x3 PB?
- The most times someone has beaten a given person (Erik, say) at any event?
- Most DNFs at 3x3? Most solves?

And how are you getting competitions with so many people when the WR is 214 people (WC07)?


----------



## jcuber (Jun 12, 2009)

LOL at the "retard zone" on the "effects of not practicing" graph .

EDIT: this is the first time I have seen 3 people post at the same time.


----------



## Roux-er (Jun 12, 2009)

Very Awesome!!! I love the retard zone!


----------



## Lucas Garron (Jun 12, 2009)

badmephisto said:


> Lucas Garron said:
> 
> 
> > Some preliminary suggestions:
> ...


Your competitor counts. There should only be 1 competition over 150, and about 5 at 150? Your counts go up to 250.


badmephisto said:


> I calculated the averages a LITTLE different. I take ALL times in a day, discard all DNF's, then chop off biggest and lowest time to get final times.


Just a clarification: with "day" you mean "competition," right?


----------



## badmephisto (Jun 12, 2009)

qqwref said:


> Very very neat. Can we see:
> - Fastest improvement in a short time? (Both times should be 3x3 PBs, so we don't just count people who did really badly for some reason.)
> - A graph of PB singles vs. PB averages? Who's got the best and worst single:average ratio? What was the best and worst single:average ratio in any given non-DNF-average round?
> - Most times someone has improved their 3x3 PB?
> ...



some great suggestions!
And @ number of people at comps, I don't know!
I think it's because there were two competitions on that day? I don't actually do it per competition, but per day. So if there were 2 comps on the same day, the numbers would get added together. I just realized that, oops.

I'll have to look into it when I come home, I'm going partying. Oh, and if I get drunk I'll have to look at it tomorrow. Thanks for the suggestions!
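
The per-day versus per-competition mix-up is easy to reproduce on toy data. The sketch below assumes result rows of the form `(competition_id, date, competitor_id)`; that layout, the helper name `count_competitors`, and every value are made up for illustration and are not the real WCA export format.

```python
from collections import defaultdict

# Hypothetical rows: (competition_id, date, competitor_id). All values are made up.
results = [
    ("CompA", "2009-05-02", "p1"),
    ("CompA", "2009-05-02", "p2"),
    ("CompB", "2009-05-02", "p3"),
    ("CompB", "2009-05-02", "p4"),
    ("CompB", "2009-05-02", "p5"),
]


def count_competitors(rows, key_index):
    """Count distinct competitors, grouped by the column at key_index."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[key_index]].add(row[2])
    return {key: len(people) for key, people in groups.items()}


per_competition = count_competitors(results, 0)  # grouped by competition
per_day = count_competitors(results, 1)          # grouped by date
print(per_competition)
print(per_day)
```

Grouping by date merges the two competitions into one group of five, which is the kind of inflation that would push counts past the largest real competition.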


----------



## miniGOINGS (Jun 12, 2009)

Roux-er said:


> Very Awesome!!! I love the retard zone!



umm, excuse me for asking, but where might this retard zone be?


----------



## jcuber (Jun 12, 2009)

miniGOINGS said:


> Roux-er said:
> 
> 
> > Very Awesome!!! I love the retard zone!
> ...



In the graph about the effects of not practicing.


----------



## miniGOINGS (Jun 12, 2009)

jcuber said:


> miniGOINGS said:
> 
> 
> > Roux-er said:
> ...



...and that would be...where?


----------



## waffle=ijm (Jun 12, 2009)

miniGOINGS said:


> jcuber said:
> 
> 
> > miniGOINGS said:
> ...



second to last graph


----------



## Lucas Garron (Jun 12, 2009)

I'm going to keep you busy:

Statistics ideas


----------



## waffle=ijm (Jun 12, 2009)

Lucas Garron said:


> I'm going to keep you busy:
> 
> Statistics ideas



I like these statistic ideas 
I'd actually like to see the statistics for "Most number of accented letters in name"


----------



## Sa967St (Jun 12, 2009)

> Some people go to every single competition together...
> 
> Rebecca Hughey and Marie Hughey went to all 9 competitions together.
> Sarah Strong and Emma Moseley went to all 7 competitions together.
> ...


lol @ the first one

Doesn't this mean that Jai, Peter and Ryan went to all 6 of their competitions together?


----------



## PatrickJameson (Jun 12, 2009)

All 13.37 results: http://pastebin.com/f3ca972a2
*People with most xx:xx.00 times: http://pastebin.com/f3dededce

I'll be doing some more.
*Excluded all times above 10 minutes, as they are often rounded (as requested by qqwref).


----------



## cmhardw (Jun 12, 2009)

BLD statistics like:
overall accuracy rate for 3x3x3 BLD, 4x4x4 BLD, 5x5x5 BLD, multi BLD old, Multi-BLD

Also some kind of comparison of accuracy rate to single solve times. I am guessing that these are inversely proportional up to a point (i.e. those with high accuracies will tend to be slower than those with low accuracies). Of course you have to factor in beginners having slower times and lower accuracies, but I wonder if some sort of trend will come to light if you factor in all the data.

Chris


----------



## PatrickJameson (Jun 12, 2009)

All 13.37 results: http://pastebin.com/f3ca972a2
People with the most xx:xx.00 times: http://pastebin.com/f3dededce
Most common xx:xx.00 times: http://pastebin.com/f30e6f1e0
Most common times(see post below for the top few): http://pastebin.com/f43fd7a8

The 'Most common times' one is interesting. I'll do some more tomorrow.


----------



## qqwref (Jun 12, 2009)

PatrickJameson said:


> Most common times: http://pastebin.com/f43fd7a8



For those who don't wanna download like 2MB of text, here are the top:
16.96: 114
16.71: 111
15.84: 109
18.31: 107
17.96: 105
15.03: 105
16.90: 104
17.46: 104
16.18: 104
18.09: 104
19.34: 103
19.53: 103
18.40: 103
15.09: 102
1.78: 102
17.84: 102
16.59: 101
16.11: 101
17.72: 101
18.90: 101
15.83: 100
15.56: 100
17.28: 100
19.03: 100
17.53: 100


----------



## PatrickJameson (Jun 12, 2009)

OK, this is REALLY weird. These are the most common .xx's (the first list is sorted by frequency, the second by .xx):

http://pastebin.com/fa1a2584
http://pastebin.com/f6bae0f30
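
For anyone who wants to redo this count, here is a minimal sketch. It assumes each time is stored as an integer number of hundredths of a second (so 16.96 is `1696`); the function name `hundredths_counts` and the sample values are made up, and this is not Patrick's actual code.

```python
from collections import Counter


def hundredths_counts(times_in_centiseconds):
    """Count how often each .xx value (0-99) occurs among the given times."""
    return Counter(t % 100 for t in times_in_centiseconds)


# Made-up sample: 16.96, 16.71, 15.96, 1.78 and 17.96 as integer hundredths.
sample = [1696, 1671, 1596, 178, 1796]
counts = hundredths_counts(sample)

by_frequency = counts.most_common()  # most common .xx first
by_value = sorted(counts.items())    # sorted by .xx
print(by_frequency)
print(by_value)
```

Working with integer hundredths avoids floating-point surprises when extracting the last two digits.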


----------



## Dene (Jun 12, 2009)

PatrickJameson said:


> All 13.37 results: http://pastebin.com/f3ca972a2
> People with the most xx:xx.00 times: http://pastebin.com/f3dededce
> Most common xx:xx.00 times: http://pastebin.com/f30e6f1e0
> Most common times(see post below for the top few): http://pastebin.com/f43fd7a8
> ...



Oh man, Philip has 2 13.37 times. He must be super l33t.
It's funny how magic (I assume?) makes an appearance in the most common times list. It rather stands out 

Oh and Lucas: You have (had) way too much time on your hands :/


----------



## cmhardw (Jun 12, 2009)

PatrickJameson said:


> Ok this is REALLY weird. These are the most common .xx's(First is sorted by most common second sorted by .xx's)
> 
> http://pastebin.com/fa1a2584
> http://pastebin.com/f6bae0f30



I think that proves that the Stackmat is a flawed timer :-( Or am I interpreting this incorrectly?

Chris


----------



## DavidWoner (Jun 12, 2009)

I would like to see who has the highest winning percentage for each event, where they have competed in it at least 5 times. Maybe longest winning streak for each event as well.


----------



## masterofthebass (Jun 12, 2009)

Vault312 said:


> I would like to see who has the highest winning percentage for each event, where they have competed in it at least 5 times. Maybe longest winning streak for each event as well.



I think I may get it for 5x5. Just maybe.


nvm... Erik has a 23 streak for minx and a 21 streak for 5x5. I didn't know he had so many comps since WC07.


----------



## pjk (Jun 12, 2009)

cmhardw said:


> PatrickJameson said:
> 
> 
> > Ok this is REALLY weird. These are the most common .xx's(First is sorted by most common second sorted by .xx's)
> ...


I agree, the stackmat timer is flawed according to those results.


----------



## vrumanuk (Jun 12, 2009)

*?*



pjk said:


> cmhardw said:
> 
> 
> > PatrickJameson said:
> ...



I don't follow. Maybe I haven't looked at the results long enough; what am I missing? Forgive my scant analysis of the figures, it's late.


----------



## Mike Hughey (Jun 12, 2009)

vrumanuk said:


> pjk said:
> 
> 
> > cmhardw said:
> ...



A significant percentage of the times are FAR less likely than the others. And there are two common levels - one around 3000 and the other around 500. I suspect most of us have always suspected this - I know I have. It's a very interesting finding.

I'm guessing this probably means that the internal mechanism for measuring time doesn't do so in seconds, but in some partial second increment, and the rounding to seconds leads to this kind of result. Just my guess as to the explanation.


----------



## blah (Jun 12, 2009)

Mike Hughey said:


> I'm guessing this probably means that the internal mechanism for measuring time doesn't do so in seconds, but in some partial second increment, and the rounding to seconds leads to this kind of result.



Exactly what I thought.


----------



## qqwref (Jun 12, 2009)

Well, there are 60 "common times", so... 60 frames per second?
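
The "60 ticks per second" guess can be checked with a toy model. The sketch assumes a timer that counts whole ticks and truncates the elapsed time to hundredths; both the tick rate and the truncation are assumptions, and `reachable_hundredths` is a made-up name. It only shows that such a timer could display exactly 60 of the 100 possible .xx values. It makes no claim to reproduce the real distribution.

```python
def reachable_hundredths(ticks_per_second):
    """Hundredths values a timer can show if it counts whole ticks and truncates."""
    return {(tick * 100) // ticks_per_second for tick in range(ticks_per_second)}


reachable = reachable_hundredths(60)
unreachable = sorted(set(range(100)) - reachable)
print(len(reachable), "values can be shown,", len(unreachable), "never appear")
```

In this model the 40 missing values would never appear at all, so it cannot by itself explain why the rare .xx values still show up a few hundred times each.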


----------



## xXdaveXsuperstarXx (Jun 12, 2009)

Have you ever noticed that the lowest time a Stackmat will let you get is 0.16? Or is that just the first-version Stackmat?


----------



## clement (Jun 12, 2009)

Mike Hughey said:


> A significant percentage of the times are FAR less likely than the others. And there are two common levels - one around 3000 and the other around 500. I suspect most of us have always suspected this - I know I have. It's a very interesting finding.
> 
> I'm guessing this probably means that the internal mechanism for measuring time doesn't do so in seconds, but in some partial second increment, and the rounding to seconds leads to this kind of result. Just my guess as to the explanation.



There is a periodicity of 25. It would be interesting to run a spectral analysis. (Or is that just my professional bias?)

Stefan, are you having fun editing your post at least 3 times?


----------



## Stefan (Jun 12, 2009)

xXdaveXsuperstarXx said:


> Have you ever noticed that the lowest time a stack mat will let you get is 0.16 mili-sec.


Your stackmat can measure 0.00016 seconds.



clement said:


> Stefan, are you having fun editing your post at least 3 times?


No. Sorry. I just didn't feel happy with the first versions.


----------



## shelley (Jun 12, 2009)

xXdaveXsuperstarXx said:


> Have you ever noticed that the lowest time a stack mat will let you get is 0.16 mili-sec. Or is that just the first version stack mat.



I don't know what timing device you're using, but it certainly isn't a stackmat.


----------



## vrumanuk (Jun 12, 2009)

Mike Hughey said:


> vrumanuk said:
> 
> 
> > pjk said:
> ...



Alright, that makes sense. I just noticed the huge drop-off from 2853 to 900.


----------



## Mike Hughey (Jun 12, 2009)

Mike Hughey said:


> I'm guessing this probably means that the internal mechanism for measuring time doesn't do so in seconds, but in some partial second increment, and the rounding to seconds leads to this kind of result. Just my guess as to the explanation.


I just realized this was incredibly stupid. I meant hundredths of a second, not seconds. But it's funny, everyone seems to have understood what I meant. Thank you.


----------



## PatrickJameson (Jun 12, 2009)

Ok so I just figured out something quite awesome (or horrible, depending on your view).

If we take the 'Most common .xx's' data (http://pastebin.com/f6bae0f30) we can see that all of the counts are either >2582 or <901. This is interesting on its own. But if we look at how many high counts and how many low counts appear in a row as we go through the .xx values in order, we find that they form a perfect pattern. This is what we get:

112121211111212122
112121211111212122
112121211111212122
112121211111212122
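
Reading each digit above as the length of a run of consecutive .xx values that are all "common" or all "rare" (an interpretation of the post, not something it states outright), the pattern can be recomputed with a run-length pass. The function name `run_lengths`, the threshold, and the toy counts are assumptions; the real counts are in the pastebin.

```python
from itertools import groupby


def run_lengths(counts, threshold):
    """Lengths of consecutive runs of 'common' (>= threshold) and 'rare' counts."""
    flags = [count >= threshold for count in counts]
    return [len(list(group)) for _, group in groupby(flags)]


# Made-up counts for eight consecutive .xx values, not the real data.
toy_counts = [3000, 2900, 500, 3100, 450, 480, 2950, 3050]
print(run_lengths(toy_counts, threshold=1500))

# Sanity check on the posted pattern: sum the digits of one row.
row = "112121211111212122"
row_total = sum(int(digit) for digit in row)
print(row_total)
```

The digits of one row sum to 25, so the four identical rows cover exactly 100 values, which agrees with the periodicity of 25 mentioned earlier in the thread.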


----------



## Dene (Jun 12, 2009)

Whoa that's crazy. How did you even notice that?


----------



## pjk (Jun 12, 2009)

PatrickJameson said:


> Ok so I just figured out something quite awesome(or horrible, depending on your view).
> 
> If we take the 'Most common .xx's' data(http://pastebin.com/f6bae0f30) we can see that all of the times are either >2582 or <901. This is interesting alone. But if we take a look on the number of times these numbers appear in a row, we find that it forms a perfect pattern. This is what we get;
> 
> ...


It makes sense that a pattern would exist, given the data you sorted earlier. There is obviously something in the timer causing this.

I contacted Bob Fox about it and he said that their lead engineer would get back to me shortly. I will let you know what I find out.


----------



## qqwref (Jun 12, 2009)

By the way, given the numbers, I estimate that about 1/3 to 1/4 of the stackmats used do NOT have this problem. I think it must be that some versions (the most common ones) of the timer have this weird distribution, while others don't.
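
One way to arrive at a figure in this range is a simple mixture model. The post does not say how the estimate was made, so everything here is an assumption: a fraction `f` of all recorded times comes from timers that show every .xx equally often, and the rest comes from timers that show only 60 of the 100 values. A rare .xx then gets a share of `f/100` and a common one gets `f/100 + (1 - f)/60`; solving for `f` from the observed rare-to-common ratio gives the function below (the name `uniform_fraction` is made up).

```python
def uniform_fraction(rare_count, common_count, reachable=60):
    """Fraction of times from 'uniform' timers under the mixture model above."""
    r = rare_count / common_count
    # Solve (f/100) / (f/100 + (1 - f)/reachable) = r for f.
    return 100 * r / (reachable * (1 - r) + 100 * r)


# Rough levels quoted earlier in the thread: about 500 versus about 3000.
print(uniform_fraction(500, 3000))
```

With those rough levels the model gives about a quarter. Strictly, that is a fraction of recorded times rather than of physical timers.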


----------



## StachuK1992 (Jun 12, 2009)

I have an idea to quicken this process.

Does anyone have access to a server that people could upload a bunch of times?

I'm thinking that we could get a bunch of people to do an easy-ish puzzle (2x2/3x3) over and over, and have each person possibly upload their times to said database, then work from those numbers.


----------



## spdcbr (Jun 12, 2009)

lol for the retard zone.


----------



## PatrickJameson (Jun 12, 2009)

Stachuk1992 said:


> I have an idea to quicken this process.
> 
> Does anyone have access to a server that people could upload a bunch of times?
> 
> I'm thinking that we could get a bunch of people to do an easy-ish puzzle (2x2/3x3) over and over, and have each person possibly upload their times to said database, then work from those numbers.



Almost a quarter million times aren't good enough?


----------

