This book is suitable for interested laypeople who would like to try creating and analyzing big data in medicine without serious Python programming experience. Since the examples described in this book use open-source Python libraries, you can immediately download the example program sources from my GitHub site.
Understanding the detailed algorithms is not required at all. However, for Python programming you do need the skills of modularity and abstraction.
Deep learning frameworks such as Keras and PyTorch are built on open-source libraries that embody modularity and abstraction. These frameworks enable us to build a target machine learning system easily instead of writing it from scratch in C.
This book presents state-of-the-art ensemble methods (ensemble machine learning), including “Random Forest”, “ExtraTrees”, “Gradient Boosting”, “AdaBoost”, “Voting”, “Bagging”, “LightGBM”, “Deep Learning”, and “Stacking”.
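As a minimal sketch (not the book's code), several of the ensemble methods named above can be compared side by side with scikit-learn on a synthetic dataset; the dataset sizes and random seeds here are arbitrary assumptions for illustration:

```python
# Compare several scikit-learn ensemble classifiers on synthetic data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import (RandomForestClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier, AdaBoostClassifier)

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "RandomForest": RandomForestClassifier(random_state=0),
    "ExtraTrees": ExtraTreesClassifier(random_state=0),
    "GradientBoosting": GradientBoostingClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, round(model.score(X_test, y_test), 3))
```

Each classifier exposes the same `fit`/`score` interface, which is exactly the modularity and abstraction the preceding paragraphs refer to.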
The source programs introduced in this book are relatively short and written in human-readable Python.
For classification problems, this book introduces how to compute accuracy, the confusion matrix, precision, recall, specificity, and the F1 score.
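As an illustration (not taken from the book), the metrics listed above can be computed with scikit-learn for a small hand-made binary example; specificity has no dedicated scikit-learn function, so it is derived from the confusion matrix:

```python
# Classification metrics for a small binary example.
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score, f1_score)

y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("accuracy   :", accuracy_score(y_true, y_pred))   # (tp+tn)/total
print("precision  :", precision_score(y_true, y_pred))  # tp/(tp+fp)
print("recall     :", recall_score(y_true, y_pred))     # tp/(tp+fn)
print("f1         :", f1_score(y_true, y_pred))
print("specificity:", tn / (tn + fp))                   # derived by hand
```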
Machine learning algorithms can deal only with numbers in a dataset. A finite set of integer values (class labels) is handled by classification algorithms, while continuous values (typically real numbers) are handled by regression algorithms.
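The distinction can be sketched as follows (a toy illustration, not the book's code; the data values are invented):

```python
# Classification handles finite integer labels; regression handles
# continuous target values.
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = [[1], [2], [3], [4]]
y_class = [0, 0, 1, 1]        # finite integer labels -> classification
y_reg = [1.5, 2.4, 3.1, 4.8]  # continuous values     -> regression

clf = DecisionTreeClassifier(random_state=0).fit(X, y_class)
reg = DecisionTreeRegressor(random_state=0).fit(X, y_reg)
print(clf.predict([[2.5]]))   # predicts a class label (0 or 1)
print(reg.predict([[2.5]]))   # predicts a real number
```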
In machine learning, string values in a given dataset must be converted to unique integers; this conversion is part of preprocessing.
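One common way to perform this string-to-integer conversion (a minimal sketch, not the book's code) is scikit-learn's `LabelEncoder`, which assigns each distinct string a unique integer in alphabetical order:

```python
# Convert string values to unique integers with LabelEncoder.
from sklearn.preprocessing import LabelEncoder

colors = ["red", "green", "blue", "green", "red"]
encoder = LabelEncoder()
codes = encoder.fit_transform(colors)
print(codes)             # [2 1 0 1 2] (alphabetical: blue=0, green=1, red=2)
print(encoder.classes_)  # ['blue' 'green' 'red']
```

The `inverse_transform` method recovers the original strings from the integer codes, which is useful when reporting predictions.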
You must understand how to prepare a dataset and preprocess it for machine learning. Data preprocessing includes handling missing data and replacing null values, and it plays a key role in machine learning. This book also shows how to cope with imbalanced datasets using the imblearn library.
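A minimal sketch of one such preprocessing step (not the book's code; the data values are invented) replaces missing values with the column mean using scikit-learn's `SimpleImputer`:

```python
# Replace missing (NaN) values with the mean of each column.
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [7.0, np.nan]])
imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X)
print(X_filled)  # NaNs replaced by the column means (4.0 and 2.5)
```

For imbalanced datasets, imblearn resamplers such as `RandomOverSampler` and `SMOTE` follow a similar pattern: their `fit_resample(X, y)` method returns a rebalanced copy of the data.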
If you want to explain the results of machine learning and how a model reaches its conclusions, this book shows how to generate explainable decision trees, that is, how to convert a black box into an explainable one.
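One simple way to obtain an explainable model (a sketch, not necessarily the book's method) is to print a decision tree's rules with scikit-learn's `export_text`, shown here on the built-in iris dataset:

```python
# Print the if/then rules of a shallow decision tree.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)
print(export_text(tree, feature_names=list(iris.feature_names)))
```

The printed rules show exactly which feature thresholds lead to each predicted class, making the model's conclusions inspectable.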
The first example (the Pima Indians diabetes problem), solved with the random forest ensemble algorithm, is used to explain the train_test_split function and other important functions. The diabetes dataset contains records of 768 women, each with 9 parameters:
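The train_test_split usage mentioned above can be sketched as follows; this is an illustration rather than the book's program, using a synthetic stand-in for the diabetes data (768 rows, 8 input features plus a class label, matching the dataset's 9 columns):

```python
# Split a 768-row dataset into training and test parts, then fit a
# random forest, as in the first example.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.random((768, 8))           # 768 rows, 8 input features
y = rng.integers(0, 2, size=768)   # binary outcome (diabetic or not)

# Hold out 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```

With `test_size=0.25`, 576 rows are used for training and 192 for testing; on the real dataset the model learns from the 8 features to predict the diabetes label.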
The second example diagnoses skin cancer from image data. The skin cancer dataset, HAM10000, was released by Harvard University. Using HAM10000, this book deals with a skin cancer classification problem in which skin images are classified into seven classes of skin cancer.