Theory and Practice

  1. What are the two assumptions we make in order to believe that learning from data is feasible?

  2. What do the letters P-A-C in the PAC-learning framework mean?

  3. How many hypotheses are there in the hypothesis class of boolean literals?

  4. How many training samples are needed to achieve a generalization error tolerance of 0.1 and confidence of 90% for a hypothesis class of 4 boolean literals?

  5. Define VC Dimension.

  6. What is the VC dimension of axis-aligned rectangles?

  7. What is the VC dimension of lines (or hyperplanes)?

  8. As a rule of thumb, how many training examples do you need to get decent generalization error for an infinite hypothesis class?

  9. Explain the approximation-generalization tradeoff.

  10. Explain inductive bias.

  11. Explain the bias-variance decomposition.

  12. What is overfitting?

  13. How do you recognize when overfitting has occurred?

  14. What is the primary cause of overfitting?

  15. Which models are more likely to overfit?

  16. What is regularization?

  17. What is validation?

  18. What is cross-validation?

  19. What is the most important use of validation?

  20. What is Occam’s Razor?

  21. What is sampling bias?

  22. What is data snooping?

Turn-in Procedure

Write up your homework in a format that can be converted to a PDF file (like LaTeX) and name the PDF file theory-practice.pdf. Submit your theory-practice.pdf file on Canvas as an attachment. When you’re ready, double-check that you have submitted and not just saved a draft.

Verify the Success of Your Submission to Canvas

Practice safe submission! Verify that your HW files were truly submitted correctly, that the upload was successful, and that your program runs with no syntax or runtime errors. It is solely your responsibility to turn in your homework and to carry out this verification step.