Assignment: Diagnostic tool (logistic regression)

Skeletons — Image: public domain, via Wikimedia Commons

Your task is to make a diagnostic tool (not for real medical use) that asks a medical expert six numerical quantities obtained by radiographic measurements of a patient:

pelvic incidence
pelvic tilt
lumbar lordosis angle
sacral slope
pelvic radius
grade of spondylolisthesis

As an output, your program should provide a probability estimate of the patient having each type of a vertebral abnormality (disk hernia or spondylolisthesis). In the training data, this information is stored in the class variable.

H = disk hernia
SL = spondylolisthesis
NO = normal

Task: Load and preprocess data

Load the data set.

Vertebral column data set
Material	Link	Reference
Data set	csv	Guilherme de Alencar Barreto (guilherme '@' deti.ufc.br) & Ajalmar R. da Rocha Neto (ajalmar '@' ifce.edu.br), Department of Teleinformatics Engineering, Federal University of Ceará, Fortaleza, Ceará, Brazil. Downloadable via Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

There’s some preprocessing required: be sure to read the data description and label the variables accordingly.

Task: Build a logistic regression model

Build a logistic regression model that predicts the condition of a patient.

Task: Validate

Task: Obtain an accuracy estimate of the classifier using split or cross validation.

Task: Build a diagnostic tool

Build a diagnostic tool that asks the user for the six measurements that make the explanatory variable values. Then, print a suggested diagnosis (prediction) for the patient.

Tips: You can use Python input function for reading user data. E.g. to read the value of pelvic incidence into a variable called incidence, simply write incidence = input("Enter pelvic incidence")

Back to main page