Although any scrambled Rubik's Cube can, in theory, be solved in at most 20 moves (counting a half turn as one move) or 26 moves (counting quarter turns only), current solvers are largely based on brute-force search. Given the enormous number of possible configurations (about 4.3 × 10^19), one major challenge in training a learning system is the sparse reward: only the solved state yields any reward signal. The paper, from researchers at the University of California, Irvine (not DeepMind), introduces a new algorithm, "Autodidactic Iteration", and reports training for 2,000,000 iterations over approximately 8 billion cubes in 44 hours.
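To see where the 4.3 × 10^19 figure comes from, here is a minimal sketch of the standard counting argument for the 3×3×3 cube. The function name `rubiks_cube_states` is made up for illustration and is not taken from the paper or the article.

```python
from math import factorial


def rubiks_cube_states() -> int:
    """Count the reachable configurations of a standard 3x3x3 Rubik's Cube."""
    corner_permutations = factorial(8)   # 8 corner pieces can be arranged in 8! ways
    corner_orientations = 3 ** 7         # 7 corners twist freely; the 8th is forced
    edge_permutations = factorial(12)    # 12 edge pieces can be arranged in 12! ways
    edge_orientations = 2 ** 11          # 11 edges flip freely; the 12th is forced
    # Corner and edge permutations must share the same parity, which halves the total.
    return (corner_permutations * corner_orientations
            * edge_permutations * edge_orientations) // 2


if __name__ == "__main__":
    total = rubiks_cube_states()
    print(total)            # exact integer count
    print(f"{total:.2e}")   # the same count in scientific notation
```

With only one of these configurations counting as solved, a learner that explores at random almost never sees a reward, which is the sparse-reward problem mentioned above.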
https://www.hpcwire.com/2018/07/25/new-deep-learning-algorithm-solves-rubiks-cube/
Solving (and attempting to solve) Rubik’s Cube has delighted millions of puzzle lovers since 1974 when the cube was invented by Hungarian sculptor and
