
The Woodpecker Method

How many data points are there, like the 3 book authors? Have there been other documentable cases regarding rating trajectories, including GM norms or not?

With so few data points, lots of variables can explain why a given regimen worked, as the blog pointed out very well, including the signature explanation but not only that one.

The blog author's theoretical counter-arguments would justify setting up experiments that could either confirm the method or offer plausible alternative explanations; that would mean progress on chess learning methodologies (including the rating measures from tournament competitions and titles, of course). And the concluding remarks do finally suggest an avenue for at least alternative explanations, if not systematic experimental design (as I might have been dreaming of).
@doom12384 said in #10:
> Instead of doing the same puzzles over and over with increasing speed, as suggested with the Woodpecker Method, perhaps it would be better to do different puzzles that all follow the same theme with increasing speed. By doing different puzzles, you avoid the risk of overfitting, but you also have the ability to focus on a theme to train it more effectively.

The Woodpecker Method already does that. It's not proposing to do just one or a few puzzles; it's proposing a set of over a thousand puzzles.

Also, generally: humans don't overfit the same way machines do. So far, all the actual data we have on the Woodpecker Method supports it, and besides speculation and analogies we don't have anything to prove the downsides mentioned.
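
As a concrete reading of "the same puzzles over and over with increasing speed" on a set of 1000+ positions, here is a minimal sketch of such a schedule; the halving factor, the starting cycle length, and the function name are my own illustrative assumptions, not the book's actual prescription.

```python
# Hypothetical "same set, faster each cycle" schedule (illustrative numbers only).
def repetition_cycles(num_puzzles: int, first_cycle_days: float, num_cycles: int):
    """Yield (cycle number, days allotted, puzzles per day) for each pass over the set."""
    days = first_cycle_days
    for cycle in range(1, num_cycles + 1):
        yield cycle, days, num_puzzles / days
        days /= 2  # assumption: each repetition of the same set gets half the time

for cycle, days, per_day in repetition_cycles(1000, 28, 5):
    print(f"cycle {cycle}: {days:g} days, ~{per_day:.0f} puzzles/day")
```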
The overfitting argument ignores the main idea: building the initial pattern store versus later generalisation. You first have to burn the pattern into your mind, and this is hard enough. When a musician trains, they start with known pieces they practice over and over, gradually increasing speed. Once that's down, they can start to improvise and add voice. If you are learning mathematics, you first have to practice a skill consistently, say finding the turning points of a quadratic. Once that's down, you then extend outwards. Especially as I get older, I find acquiring the patterns as hard as the generalisation.
In Pump Up Your Rating, Axel Smith mentions that the tactical motifs have to be ingrained into the player's memory before starting Woodpecker (page 224 in the physical copy mentions knowing these tactical motifs first).

The point is that the Woodpecker Method is not a beginner training exercise, and technically shouldn't be used to learn what the motifs are. It is not exactly the same as de la Maza's book from what I remember, since the latter doesn't mention the initial preparation needed.

So before starting Woodpecker, intensive study of what pins, skewers, etc. are should occur. I would think this would cut down on any superficial training, since the player should already know the motif and only have to figure out the way to exploit it.
@lorb said in #14:
> Also, generally: humans don't overfit the same way machines do. So far, all the actual data we have on the Woodpecker Method supports it, and besides speculation and analogies we don't have anything to prove the downsides mentioned.

All available data: I'm not doubting that it might be true, but what is the extent of all available data?
A few book authors with appetizing titles, and how many books get sold, would not be a convincing argument.
The rationales about spaced or scheduled repetitions are themselves speculation and analogies until proven.
So while in hypothesis land, stating alternative hypotheses is also a valid argument to share.

The blog proposes a sound alternative, and possibly the opportunity to test it, not one author at a time. The statistics would luckily be a nice side effect of the group hypothesis as an alternative motivator, possibly resulting in the same diligence and hard work on challenging chess positions.

I think I would agree that we need data. So where is that available data?

The last post about pre-existing learning does point to more uncontrolled variables, not part of the Woodpecker pitch.
It also makes the dart-throwing analogy possibly more relevant, because the physical-sport muscle memory that I think has been used as a supporting analogy, or perhaps even in the method's ideation, does not make explicit that the internal proprioception model that allows humans to throw things in some general direction with some effectiveness is usually learned in early childhood. (I was said to have started early myself, ejecting baby food that had accumulated in my mouth and was not to my liking; I must have learned some basic ballistics then, I bet.) Locomotion is learned by trial and error, rarely by repeating precise motions, more by trying all directions of muscle tension and learning not to contract antagonist and agonist at the same time.

I agree that general arguments from other fields do not suffice, but that also applies to the Woodpecker Method.
Music: we have innate musical perception, or very early development of it, before even becoming musicians.

That is not about recognition but more about execution precision and speed, it seems: being reproducible, and allowing one to be part of a group so that the hard thinking processes can be about the whole coordination between musicians (since what the others will do cannot be second nature), or about other conscious processes needed during performance.

Basically, neural networks in machines emulate basic visual cortex architecture, so I would say that they might model certain aspects of our ability to store and recognize patterns. In chess there is board-feature pattern recognition, and mapping that to dynamic patterns. The notion of generalization is not an artefact of the CPU implementation of neural networks; it is a mathematical property of the basic cortex-like layer structure. It is about how a flexible function family can fit data in a parsimonious fashion and still extrapolate correctly to unseen new test input.
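
As a toy illustration of that last point (my own sketch, nothing chess-specific): a very flexible function family fitted to a handful of noisy points will reproduce them exactly, while a more parsimonious member of the family usually tracks the underlying function better at unseen inputs.

```python
# Toy overfitting vs. generalization demo (illustrative numbers only).
import numpy as np

def true_fn(x):
    return 1.5 * x  # the underlying "reality" to be learned

rng = np.random.default_rng(42)
x_train = np.linspace(0, 1, 6)
y_train = true_fn(x_train) + rng.normal(0, 0.3, x_train.size)  # noisy observations
x_test = np.linspace(0, 1, 101)  # unseen inputs

for degree in (1, 5):  # parsimonious line vs. very flexible degree-5 polynomial
    coeffs = np.polyfit(x_train, y_train, degree)
    train_rmse = np.sqrt(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    test_rmse = np.sqrt(np.mean((np.polyval(coeffs, x_test) - true_fn(x_test)) ** 2))
    print(f"degree {degree}: train RMSE {train_rmse:.3f}, test RMSE {test_rmse:.3f}")

# The degree-5 fit passes through every noisy training point (train RMSE near 0),
# but typically shows the larger error against the true function on unseen inputs.
```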

The functions do not have any prior knowledge about the reality to be learned (an exaggeration: we do have some connectivity that may be constraining or biased, like all animals in their ecosystems, toward some life-relevant stimuli; gravity might have had some effect, maybe).

As humans we can learn new things and adapt to new visual environments (think of the inverting goggles which, if not a myth, can be learned away over the time scale of weeks, with the brain still making sense of where up and down are for all its locomotion needs).

So, on the board-feature recognition aspect alone, doing too much of one thing will keep adjusting the very flexible function family to the fine details of the same position, even those that may not be relevant to the execution pattern being optimized for speed without any conscious interference.

So yes, overfitting would apply in wet networks too; we don't have conscious control over the visual subconscious processes that actually implement such mathematical functions. So the only argument against overfitting is that there are 1000 puzzle position challenges, but we also need to know how much variety they span.
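
On "how much variety they span": one could at least measure it crudely. Here is a hypothetical sketch (the similarity measure and the example FENs are placeholders I made up) that scores how much two puzzle positions overlap square by square; averaging that over all pairs of a 1000-position set would give a rough idea of how diverse the set really is.

```python
# Rough, made-up diversity check for a set of puzzle positions:
# expand each FEN board field into 64 squares and count identical squares.
def board_squares(fen: str) -> list[str]:
    """Expand the board part of a FEN into a flat list of 64 square contents."""
    squares = []
    for row in fen.split()[0].split("/"):
        for ch in row:
            squares.extend(["."] * int(ch) if ch.isdigit() else [ch])
    return squares

def overlap(fen_a: str, fen_b: str) -> float:
    """Fraction of the 64 squares holding the same content in both positions."""
    return sum(a == b for a, b in zip(board_squares(fen_a), board_squares(fen_b))) / 64

# Example with two placeholder positions (the starting position and 1. e4):
start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
after_e4 = "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e4 0 1"
print(overlap(start, after_e4))  # high overlap -> these two add little variety
```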
I don't know why this method is that questionable if, for example, there are a lot of well-respected books and lessons that tell you "x number of fundamental positions to learn". I remember GM Hikaru trying to recall one of the positions from those kinds of books, because the puzzle or game he was playing had a lot of similarities and he had, if not the certainty, the instinct to evaluate that position as winning for him if played with accuracy. I think the only problem with this method is the kind of puzzles that you're picking, because at least for the fundamental or most common positions I would really trust this. Or for common tactics in my opening repertoire. Or endgames.
<Comment deleted by user>