https://www.youtube.com/watch?v=1waHlpKiNyY&list=PLkDaE6sCZn6Hn0vK8co82zjQtt3T2Nkqc
Finding the best hyperparameters → go around the trial cycle many times
What matters is how efficiently you can go around the cycle
and how well you build and design the datasets (train, val, test)
→ If the dataset is small enough, the traditional way of splitting is okay.
→ Mismatched train/test distributions can give a biased (misleading) result → make sure dev and test come from the same distribution
A test set may not be necessary if an unbiased estimate of the final outcome is not needed
: in that case, use only train and dev and iterate the cycle
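
A minimal sketch of the split described above, using only the Python standard library. The function name `split_dataset`, its parameters, and the 60/20/20 ratio are assumptions for illustration, not something fixed by the lecture. The point it shows: dev and test are cut from the same shuffled pool, so they share one distribution, and setting `test_frac=0.0` gives the train/dev-only setup.

```python
import random


def split_dataset(examples, dev_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle once, then cut into train/dev/test lists.

    dev and test come from the same shuffled pool, so they follow
    the same distribution. The fractions are illustrative defaults.
    """
    if dev_frac < 0 or test_frac < 0 or dev_frac + test_frac >= 1:
        raise ValueError("dev_frac and test_frac must be >= 0 and sum to < 1")

    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a reproducible split

    n = len(shuffled)
    n_dev = round(n * dev_frac)
    n_test = round(n * test_frac)

    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]
    return train, dev, test


# Traditional-style split on a small dataset (assumed 60/20/20).
train, dev, test = split_dataset(range(100))

# No test set: iterate on train/dev only.
train_only, dev_only, no_test = split_dataset(range(100), test_frac=0.0)
```

Without a test set, the dev score is the only number available, and it gets optimistic as the cycle is repeated on it.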
Bias and Variance