Introduction

In this project, you will apply two algorithms to a contact lens data set. Answer the questions in the order in which they appear. Some early questions ask you to predict outcomes from your analysis of the data and your understanding of the algorithms; do not skip ahead to later steps to answer them.

The Data Set

UCI lenses data set

The Algorithms

The Report

Write a report containing your responses to the following:

The Contact Lens Problem

Are the features of the data numerical or categorical?

A supervised learning problem is typically either a regression problem or a classification problem. Which kind of problem is the contact lens problem described in the data set?

Which kind of classification problem is the contact lens problem: binary or multi-class?

How many binary classification problems can be derived from the data set?

For each binary classification problem you can derive from the data set, what is the minimum performance baseline?
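As a starting point for the baseline question, here is a minimal sketch of a majority-class baseline. It assumes a one-vs-rest binarization (one class against all others), which is only one way to derive binary problems from a multi-class data set, and it takes "minimum performance baseline" to mean the accuracy of always predicting the more frequent side. The function name `one_vs_rest_baselines` and the toy labels are hypothetical; the labels are not the lenses data.

```python
from collections import Counter


def one_vs_rest_baselines(labels):
    """Map each class to the majority-class accuracy of its one-vs-rest problem."""
    n = len(labels)
    baselines = {}
    for cls, positives in Counter(labels).items():
        negatives = n - positives
        # Always guessing the larger side gets this fraction of instances right.
        baselines[cls] = max(positives, negatives) / n
    return baselines


# Hypothetical toy labels for illustration only (NOT the lenses data).
toy_labels = ["a", "a", "b", "c", "c", "c", "c", "c"]
baselines = one_vs_rest_baselines(toy_labels)
for cls, accuracy in baselines.items():
    print(f"{cls} vs. rest: baseline accuracy = {accuracy:.3f}")
```

Any classifier you train on a derived binary problem should at least match the corresponding baseline to be considered useful.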

Decision Trees on the Contact Lens Data

Which attribute do you expect to be chosen as the split attribute at the root node?
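One way to reason about the root split is to score each attribute yourself. The sketch below assumes an ID3/C4.5-style learner that picks the attribute with the highest information gain; your tool may use a different criterion, such as Gini impurity or gain ratio. The helper names and the toy rows are hypothetical and are not the lenses data.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attr_index):
    """Entropy reduction from splitting on the categorical attribute at attr_index."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr_index]].append(label)
    n = len(labels)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data for illustration only (NOT the lenses data).
toy_rows = [("x", "p"), ("x", "q"), ("y", "p"), ("y", "q")]
toy_labels = ["yes", "yes", "no", "no"]
gains = [information_gain(toy_rows, toy_labels, i) for i in range(2)]
print(gains)
```

In the toy data the first attribute separates the classes perfectly and the second carries no information, so a gain-based learner would split on the first attribute at the root.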

Run a decision tree classifier on the data and report the results in a confusion matrix.
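If your tool does not print a confusion matrix for you, it can be tabulated from the actual and predicted labels. This is a minimal sketch, assuming rows are actual classes and columns are predicted classes; some tools use the transposed layout, so state your convention in the report. The function name and toy labels are hypothetical.

```python
def confusion_matrix(actual, predicted, classes):
    """Return a matrix where entry [i][j] counts actual classes[i] predicted as classes[j]."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Hypothetical toy labels for illustration only (NOT the lenses data).
toy_classes = ["a", "b", "c"]
toy_actual = ["a", "a", "b", "b", "c"]
toy_predicted = ["a", "b", "b", "b", "a"]
matrix = confusion_matrix(toy_actual, toy_predicted, toy_classes)
for cls, row in zip(toy_classes, matrix):
    print(cls, row)
```

Correct predictions lie on the diagonal; every off-diagonal entry is a misclassification.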

Extract a rule set from your decision tree.
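Each path from the root to a leaf corresponds to one rule. The sketch below assumes the tree is held as nested dictionaries with hypothetical keys `attribute` and `branches`, and that any non-dictionary value is a leaf label; the tree your tool produces will likely have a different representation, so treat this as an illustration of the traversal only.

```python
def extract_rules(tree, conditions=()):
    """Return one (conditions, label) pair per root-to-leaf path."""
    if not isinstance(tree, dict):
        # Leaf: the accumulated tests form the rule's antecedent.
        return [(list(conditions), tree)]
    rules = []
    for value, subtree in tree["branches"].items():
        test = (tree["attribute"], value)
        rules.extend(extract_rules(subtree, conditions + (test,)))
    return rules


def format_rule(conditions, label):
    """Render a rule as 'IF a = v AND ... THEN label'."""
    antecedent = " AND ".join(f"{a} = {v}" for a, v in conditions) or "TRUE"
    return f"IF {antecedent} THEN {label}"


# Hypothetical toy tree for illustration only (NOT learned from the lenses data).
toy_tree = {
    "attribute": "f1",
    "branches": {
        "low": "no",
        "high": {"attribute": "f2", "branches": {"x": "yes", "y": "no"}},
    },
}
rules = extract_rules(toy_tree)
for conditions, label in rules:
    print(format_rule(conditions, label))
```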

Boosting

Run boosted decision trees on the data set.
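This handout does not name a specific boosting algorithm. To illustrate the general idea, here is a minimal sketch of a single round of binary AdaBoost, one common boosting scheme: the weak learner's weighted error determines its vote, and instance weights are then shifted toward the misclassified examples. The function name `adaboost_round` and the toy inputs are hypothetical, and the sketch assumes the weighted error is strictly between 0 and 1.

```python
import math


def adaboost_round(weights, correct):
    """Return (alpha, new_weights) after one binary AdaBoost round.

    weights: current instance weights, summing to 1.
    correct: booleans, True where the weak learner classified the instance correctly.
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    # Vote of this weak learner; larger when its weighted error is smaller.
    alpha = 0.5 * math.log((1 - error) / error)
    # Shrink weights of correct instances, grow weights of mistakes.
    updated = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Hypothetical toy round: four equally weighted instances, one misclassified.
alpha, new_weights = adaboost_round([0.25] * 4, [True, True, True, False])
print(alpha, new_weights)
```

After the update, the misclassified instance carries half of the total weight, so the next tree is pushed to get it right. When you compare boosted and non-boosted trees, this reweighting is the mechanism to keep in mind.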

How did the boosted decision tree compare to the non-boosted decision tree?

Tips and Considerations

Turn-in Procedure

Submit your lenses.pdf file on Canvas as an attachment. When you’re ready, double-check that you have submitted and not just saved a draft.

Verify the Success of Your Submission to Canvas

Practice safe submission! Verify that your homework files were submitted correctly, that the upload was successful, and that your program runs with no syntax or runtime errors. It is solely your responsibility to turn in your homework and to verify your submission.