A supermarket chain has installed fingerprint readers at all of its locations. Your company has been hired by the supermarket chain to implement a machine learning system that recognizes participants in their membership program by their fingerprints. They have provided you with a sample data set of 100 training instances labeled with 1s for people in the program and 0s for people who aren’t in the program. A scatter plot of the data looks like this:

[Figure: Fingerprint Scatter Plot]

The CIA has also hired your company to implement a machine learning system that identifies people who should be granted access based on their fingerprints. The CIA has provided you with a data set and, amazingly, it is identical to the supermarket’s data set!

As a data scientist at the company, you have been tasked with creating machine learning models for both customers.

Part 1: Risk Matrix

Naturally, the CIA and the supermarket chain have different tolerances for false positives and false negatives, which happen to correspond exactly to the values in Example 1.1 on page 29 of Learning from Data. Do Problem 3.16 in Learning from Data.
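As a rough illustration of how a risk matrix turns a predicted probability into a decision, the sketch below compares the expected cost of accepting a person with the expected cost of rejecting them. The function names and the cost values (3 and 1) are hypothetical and chosen only for illustration; they are not the values from Example 1.1, and the sketch does not replace the derivation asked for in Problem 3.16.

```python
def expected_costs(p_member, cost_false_accept, cost_false_reject):
    """Return (expected cost of accepting, expected cost of rejecting).

    p_member is the estimated probability that the person is a member.
    """
    # Accepting is only a mistake when the person is NOT a member.
    cost_accept = (1.0 - p_member) * cost_false_accept
    # Rejecting is only a mistake when the person IS a member.
    cost_reject = p_member * cost_false_reject
    return cost_accept, cost_reject


def decide(p_member, cost_false_accept, cost_false_reject):
    """Return 1 (accept) if accepting is no more costly than rejecting, else 0."""
    cost_accept, cost_reject = expected_costs(
        p_member, cost_false_accept, cost_false_reject
    )
    return 1 if cost_accept <= cost_reject else 0


# Hypothetical costs: a false accept is three times as bad as a false reject.
for p in (0.5, 0.9):
    print(p, expected_costs(p, 3.0, 1.0), decide(p, 3.0, 1.0))
```

With these made-up costs, a person with probability 0.5 is rejected while a person with probability 0.9 is accepted, which shows that the cut-off no longer sits at 0.5 once the two kinds of error are priced differently.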


Part 2: Logistic Regression Classification

  1. Train a binary logistic regression classifier on the data set.

  2. Test your classifier using a classification threshold of 0.5.

  3. Test your classifier using the probability thresholds you derived in Part 1 for the supermarket.

  4. Test your classifier using the probability thresholds you derived in Part 1 for the CIA.
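The steps above could be sketched roughly as follows, using only the Python standard library. Everything here is an assumption made for illustration: the data is synthetic (two made-up features per fingerprint, not the provided data set), the gradient-descent settings are arbitrary, and the thresholds 0.2 and 0.8 are placeholders rather than the values you derive in Part 1. A library implementation of logistic regression would work equally well.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, x):
    # w = [bias, w1, ..., wd]; returns the estimated P(label = 1 | x).
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights by batch gradient descent on the average log loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j in range(d):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def classify(w, X, threshold=0.5):
    # Predict 1 whenever the estimated probability reaches the threshold.
    return [1 if predict_proba(w, x) >= threshold else 0 for x in X]


# Synthetic stand-in for the 100 labeled instances (hypothetical features).
random.seed(0)
X = [[random.gauss(1.0, 1.0), random.gauss(1.0, 1.0)] for _ in range(50)]
X += [[random.gauss(-1.0, 1.0), random.gauss(-1.0, 1.0)] for _ in range(50)]
y = [1] * 50 + [0] * 50

w = train_logistic(X, y)

# 0.5 is the default threshold; 0.2 and 0.8 are placeholders only.
for threshold in (0.5, 0.2, 0.8):
    preds = classify(w, X, threshold)
    accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
    print(f"threshold={threshold}: accuracy={accuracy:.2f}, accepted={sum(preds)}")
```

The model is trained once and only the threshold changes between scenarios, so the three tests differ in how the same probabilities are turned into decisions. For brevity, the sketch evaluates on the data it was trained on.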



Write a report containing your answer to Part 1 and the test results for the three scenarios in Part 2. This report should be in PDF format. You may find this template helpful (compiled PDF).

Your test results for Part 2 should include basic accuracy, confusion matrices and ROC curves. Include a brief discussion of your results. Don’t go overboard – you only need a few sentences to discuss the important points.
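One possible way to compute these quantities by hand is sketched below. The helper names and the six example labels and scores are hypothetical, and the code assumes both classes are present in the labels. A plotting library can then draw the ROC curve from the returned points.

```python
def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
    }


def accuracy(cm):
    total = cm["tp"] + cm["fp"] + cm["fn"] + cm["tn"]
    return (cm["tp"] + cm["tn"]) / total


def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs from sweeping the threshold over each score.

    Assumes y_true contains at least one 1 and at least one 0.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        preds = [1 if s >= thr else 0 for s in scores]
        cm = confusion_matrix(y_true, preds)
        points.append((cm["fp"] / neg, cm["tp"] / pos))
    return points


def auc(points):
    # Area under the ROC curve by the trapezoidal rule.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical labels and predicted probabilities, for illustration only.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

cm = confusion_matrix(y_true, [1 if s >= 0.5 else 0 for s in scores])
print(cm, accuracy(cm))
print(roc_points(y_true, scores), auc(roc_points(y_true, scores)))
```

The confusion matrix and accuracy depend on the chosen threshold, whereas the ROC curve sweeps over all thresholds, so each scenario in Part 2 corresponds to a single point on the same curve.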

Turn-in Procedure

Submit your fingerprints.pdf file on Canvas as an attachment. When you’re ready, double-check that you have submitted and not just saved a draft.

Verify the Success of Your Submission to Canvas

Practice safe submission! Verify that your HW files were truly submitted correctly, the upload was successful, and that your program runs with no syntax or runtime errors. It is solely your responsibility to turn in your homework and practice this safe submission safeguard.