That makes a lot of sense, basing the sample size on the running average. I’m surprised that the different RTPs don’t make much difference to the play criteria, but I’ve never actually tested it. Thank you!
Are there any holes in this analysis? For each feature: (value of an individual feature spin * number of spins built up) / average number of main spins needed to hit that feature. Then add the resulting decimal for each feature to the RTP of the main game to get the edge.
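To make the arithmetic concrete, here is a rough Python sketch of that calculation. The names (`Feature`, `feature_contribution`, `total_return`) and all the numbers are made up for illustration, not taken from any real game. It also assumes everything is expressed in units of one main-game bet, and that "edge" means the combined return minus 1.0 (i.e. minus 100%), which is my reading rather than something stated above.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    value_per_spin: float    # expected value of one feature spin, in units of the main bet
    spins_built_up: int      # feature spins currently accumulated on the machine
    avg_spins_to_hit: float  # average number of main spins needed to trigger the feature


def feature_contribution(f: Feature) -> float:
    """Value of a feature spin * spins built up / average main spins to hit it."""
    return f.value_per_spin * f.spins_built_up / f.avg_spins_to_hit


def total_return(base_rtp: float, features: list[Feature]) -> float:
    """Main-game RTP plus the per-spin contribution of every feature."""
    return base_rtp + sum(feature_contribution(f) for f in features)


# Hypothetical example: 90% main-game RTP, one feature with 8 spins built up.
example = [Feature(value_per_spin=1.5, spins_built_up=8, avg_spins_to_hit=100)]
ret = total_return(0.90, example)
print(f"total return: {ret:.4f}")      # main-game RTP + feature contributions
print(f"edge:         {ret - 1:.4f}")  # assumed definition: return above 100%
```

With these made-up inputs the feature adds 1.5 * 8 / 100 = 0.12 per main spin, so the combined return comes out at about 1.02. Note the sketch treats each feature independently and simply sums them, exactly as in the formula above; it does not check whether that independence actually holds.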