Science of Chess: A g-factor for chess? A psychometric scale for playing ability

@Graque said in #20:
> I didn't read the papers, so sorry if this is an elementary question, but it looks like the Amsterdam Chess Test was composed intuitively at first (i.e. the authors just threw in what they thought might be related). So there's no guarantee that this is the best possible test, or even a particularly good one, right?
>
> Looking at the correlations, it seems the tactics-related components were much more correlated to Elo. Is it possible that any validity is due to the choose-a-move/tactics components? My intuition (based on nothing) is that the recall, verbal knowledge, and motivation pieces aren't useful.
>
> If it basically comes down to tactics, then perhaps any tactics trainer (like puzzle storm) would be as good as the ACT.

It's a very good question and you're absolutely right. This is something I always emphasize to my students: Scale development is hard and you always end up starting by making guesses about stuff that seems reasonable to include. That always means there is a good chance you added in elements that aren't especially useful or missed other things that would have been apt.

The authors definitely end up saying that the choose-a-move and predict-a-move subscales are doing most of the heavy lifting. Those other tasks do end up covering some unique variance in their model fits (which is good) but it's not as much. Again, always a challenge - how many tasks (and which ones) do you add to try and account for as much variability as you can?
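
For anyone wondering what covering "unique variance" looks like in practice: the usual check is whether adding a subscale to a regression predicting Elo improves the fit beyond the tactics scores alone. Here's a toy sketch with simulated placeholder data (statsmodels) - emphatically not the authors' actual analysis:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
tactics = rng.normal(size=n)  # stand-in for choose-a-move scores
recall = rng.normal(size=n)   # stand-in for a recall subscale
# Simulated criterion - real data would be players' Elo ratings.
elo = 2.0 * tactics + 0.5 * recall + rng.normal(size=n)

# Model 1: tactics only. Model 2: tactics plus the recall subscale.
m1 = sm.OLS(elo, sm.add_constant(tactics)).fit()
m2 = sm.OLS(elo, sm.add_constant(np.column_stack([tactics, recall]))).fit()

# The gain in R^2 is the unique variance the extra subscale buys you.
print(f"R^2, tactics only: {m1.rsquared:.3f}")
print(f"R^2, with recall:  {m2.rsquared:.3f}")
```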

It would definitely be neat to take data from lichess to see how things like puzzle storm or other easy-to-administer tests do compared to the ACT. This is a case where there is so much data and so many users, I feel like you could tinker with things very quickly.
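
In that spirit, here's a rough sketch of how that comparison could start (Python; the endpoint and field names reflect my reading of the public API docs at lichess.org/api, and I'm using the puzzle rating as a stand-in since I'm less sure of Puzzle Storm's exact field layout):

```python
import numpy as np
import requests

# Placeholder list - a real study would want a large random sample of users.
usernames = ["player1", "player2", "player3"]

blitz, puzzle = [], []
for name in usernames:
    # Public profile endpoint, including per-variant perf ratings.
    resp = requests.get(f"https://lichess.org/api/user/{name}")
    perfs = resp.json().get("perfs", {})
    if "blitz" in perfs and "puzzle" in perfs:
        blitz.append(perfs["blitz"]["rating"])
        puzzle.append(perfs["puzzle"]["rating"])

# Same kind of convergent-validity check the ACT authors ran against Elo.
r = np.corrcoef(blitz, puzzle)[0, 1]
print(f"puzzle vs. blitz rating: r = {r:.2f} (n = {len(blitz)})")
```
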
This article is about as scientific as my grandma's Facebook status updates.

Great stuff. One comment I have is that it's funny how we started from "Elo is bad in terms of psychometric properties, let's instead think of a different measure"... and then we evaluate the measure in terms of its correlation to Elo :D

It is not about producing a scientific article, it is about integrating things from science (for once, or on a rare occasion, lifting our heads above the water that seems to keep them down so much; we have known what we are doing for ages, so why ever look up?).

Chess itself could benefit from some integration of its exploded parts... but that would not make anyone a better player than their neighbor.

It might be scientific about something other than individual improvement technology (as in: do this, and in 30 days your hair will grow faster than everyone else's).

@FrugoFruit90 said in #24:
> Great stuff. One comment I have is that it's funny how we started from "Elo is bad in terms of psychometric properties, let's instead think of a different measure"... and then we evaluate the measure in terms of its correlation to Elo :D

I hear you - that's why I mentioned there was a risk of circular reasoning when we got to that step! Elo isn't necessarily bad in terms of psychometric properties (though they're tough to characterize because of how it's measured), but it does lack any kind of subscale structure that helps you reason about different processes that contribute to ability. Elo is *good* for a lot of things, not the least of which is the ongoing refinement of the estimate. So yeah, it may seem a little funny to develop something new and then correlate it with the old thing, but it's because the old thing probably does capture something very well that you want to be sure the new thing can account for.
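
For concreteness, here's the textbook Elo logic in a minimal sketch (the classic formula with a fixed K, not lichess's actual implementation - lichess uses Glicko-2):

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Elo's expectation: A's predicted score against B, from the rating gap."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a: float, r_b: float, score_a: float, k: float = 20.0) -> float:
    """The ongoing refinement: nudge A's rating by how much the actual
    result beat (or fell short of) the expectation."""
    return r_a + k * (score_a - expected_score(r_a, r_b))

print(expected_score(1500, 1700))  # ~0.24: the 1500 "expects" about a quarter point
print(update(1500, 1700, 1.0))     # an upset win moves them to ~1515
```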

Thanks for reading!

@andreagobbez said in #22:
> Well, that's interesting, but maybe not so accessible to everyone.

My apologies if I missed the mark a little in making it understandable. I'd be interested to hear if there were specific parts you thought weren't explained clearly enough. Thanks for reading!

@olokololokooko said in #23:
> This article is about as scientific as my grandma's Facebook status updates.

Hm - this depends heavily on what your grandma posts to Facebook, so I don't know how to interpret this one right away. If you have suggestions to make these posts better, I'm always glad to hear them. Thanks for reading!

@NDpatzer said in #27:
> My apologies if I missed the mark a little in making it understandable. I'd be interested to hear if there were specific parts you thought weren't explained clearly enough. Thanks for reading!

Well, I'm into data science too, so the article was very interesting, thanks!
I just thought that most people want to see the result with something as simple as possible.
Elo is widely implemented: you play one game and you have a score.
This new version is a bit too complicated for everyday people, even if it may be more precise.

Anyway, thank you for making it clear that Elo right now is just a comparison of two people's chances of winning or "not winning", and not just a raw number for strength! :)

Factor analysis is mostly fake. In this particular case, your 4 factors depend heavily on (a) the arbitrary decision to fit 4 factors (rather than 3 or 5), and (b) the rotation strategy (arbitrarily chosen to be "promax"). You would get vastly different factors if you changed either of these arbitrary parameters. The same thing happens regularly in psychometrics; factor analysis should be viewed with extreme skepticism.
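
That sensitivity is easy to probe for yourself. A minimal sketch with the factor_analyzer package, using simulated placeholder data where real item-level ACT scores would go:

```python
import numpy as np
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(0)
# Placeholder: 300 "players" by 12 "test items". Swap in real scores.
X = rng.normal(size=(300, 12))

# Refit under different analyst choices and compare how the loadings move.
for n_factors in (3, 4, 5):
    for rotation in ("promax", "varimax"):
        fa = FactorAnalyzer(n_factors=n_factors, rotation=rotation)
        fa.fit(X)
        print(f"\n{n_factors} factors, {rotation} rotation:")
        print(np.round(fa.loadings_, 2))
```

In practice you'd choose the number of factors with something like parallel analysis rather than fixing it a priori, but the point stands that these are analyst choices.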