https://www.youtube.com/watch?v=1waHlpKiNyY&list=PLkDaE6sCZn6Hn0vK8co82zjQtt3T2Nkqc

  1. Cycle: Idea → Code → Experiment

Best choice of the hyper-parameters → go around this cycle of trials many times (a minimal trial loop is sketched below)
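
Loosely, the cycle can be pictured as a loop over hyper-parameter trials scored on the dev set. The sketch below is only an illustration: `train_and_score` is a hypothetical placeholder for real training code and just returns a dummy score so the loop runs.

```python
# Minimal sketch of the Idea -> Code -> Experiment cycle as a trial loop.
from itertools import product

def train_and_score(lr, hidden_units):
    # placeholder: in practice, train a model with these hyper-parameters
    # and return its accuracy on the dev set
    return 1.0 - abs(lr - 1e-3) - abs(hidden_units - 128) / 1000

learning_rates = [1e-2, 1e-3, 1e-4]   # idea: candidate hyper-parameters
hidden_units = [64, 128, 256]

best_score, best_config = float("-inf"), None
for lr, units in product(learning_rates, hidden_units):   # code + experiment
    score = train_and_score(lr, units)                     # evaluate on the dev set
    if score > best_score:
        best_score, best_config = score, (lr, units)

print("best config:", best_config, "dev score:", best_score)
```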

  1. How efficiently you can go around the cycle

  2. How well you build and design the dataset splits (train, dev, test)

    1. previous era: 70/30 train-test split, or 60/20/20 train/dev/test
    2. modern (big-data) era: the dev set can be a very small fraction of the data; it only needs to be large enough to tell which algorithm works better, and the same goes for the test set.
      1. e.g. 1,000,000 / 10,000 / 10,000 → 98% / 1% / 1%, or 99.5% / 0.25% / 0.25% (split sketch below)

    → if the dataset is small, the traditional splits are still okay.
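
A minimal split sketch under the big-data ratios above; the data here is just a placeholder NumPy array, and shuffling before splitting helps keep the dev and test sets drawn from the same distribution.

```python
# Roughly 98% / 1% / 1% split of 1,000,000 examples.
import numpy as np

rng = np.random.default_rng(0)
data = np.arange(1_000_000)           # placeholder for 1,000,000 examples
rng.shuffle(data)                     # shuffle so dev/test share one distribution

n_dev = n_test = 10_000               # 1% each is enough to compare algorithms
train = data[: len(data) - n_dev - n_test]
dev = data[len(train): len(train) + n_dev]
test = data[len(train) + n_dev:]

print(len(train), len(dev), len(test))  # 980000 10000 10000
```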

    → Mismatched dev/test distributions give misleading evaluation → make sure the dev and test sets come from the same distribution

    Maybe a test set is not even necessary, if an unbiased estimate of final performance is not needed

    : in that case just iterate the cycle with the train and dev sets

  3. Bias and Variance
