No. 1, June 2017

IMPLEMENTATION OF NAIVE BAYES METHOD IN CLASSIFICATION OF BREAST CANCER DISEASE

Limited knowledge of the early symptoms of breast cancer and of how to deal with it early, together with the still-limited number of specialist doctors, is one of the factors contributing to the increasing number of people affected by breast cancer. The breast cancer classification system developed here aims to predict an early diagnosis of breast cancer in users or patients, in one of two categories: malignant or benign. The prediction variables for the initial diagnosis are Clump Thickness, Uniformity of Cell Size, Uniformity of Cell Shape, Marginal Adhesion, Single Epithelial Cell Size, Bare Nuclei, Bland Chromatin, Normal Nucleoli, and Mitoses. Using the Naive Bayes method to process patients' diagnostic data, the system tests show that the system is able to predict and classify breast cancer into the two categories (malignant or benign) on 500 test data. With an output of malignant (YES) or benign (NO), the system predicts with an accuracy of 98%.


INTRODUCTION
Breast cancer is a malignant tumor derived from cells found in the breast. The breast consists of lobules, ducts, fat and connective tissue, blood vessels, and lymph vessels. In general, breast cancer originates from cells in the ducts, with some cases originating from the lobules and other tissues. Statistically, the risk of breast cancer is increased in nulliparous women, in women with early menarche or late menopause, and in women whose first pregnancy occurs after the age of 30. Less than 1% of breast cancers occur before the age of 25; after the age of 39 the incidence increases rapidly, and the highest incidence is found at the age of 45-50 years. Various studies related to breast cancer have been carried out. Among others, Darsyah (2013) studied breast cancer cases based on mammography results using binary logistic regression and SVM; the classification accuracy obtained from the SVM model was 99%. Sivaramakrisma et al. (2000) compared the performance of mammographic enhancement algorithms; for microcalcifications, the adaptive neighborhood contrast enhancement algorithm performed best, at 49%.

RESEARCH METHODS
This study uses the Naive Bayes algorithm, one of the algorithms used in classification techniques. Naive Bayes classifies data using probabilistic and statistical methods. It is also easy to implement and fast to run.
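For reference, the standard textbook form of the Naive Bayes decision rule can be written as follows. This is the general formulation, not a formula quoted from this paper; the symbols $C$ (class, here Malignant or Benign) and $x_1, \dots, x_n$ (the observed feature values) are notation assumed for this illustration. The rule relies on the assumption that the features are conditionally independent given the class.

```latex
P(C \mid x_1, \dots, x_n) \;\propto\; P(C) \prod_{i=1}^{n} P(x_i \mid C),
\qquad
\hat{C} = \arg\max_{C} \; P(C) \prod_{i=1}^{n} P(x_i \mid C)
```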

SYSTEM ANALYSIS AND DESIGN
The classification of breast cancer with the Naive Bayes algorithm proceeds in several stages. First, training data are entered so that the calculation can be performed for each class. Nine variables are examined to determine the type of breast cancer in a patient: Clump Thickness, Uniformity of Cell Size, Uniformity of Cell Shape, Marginal Adhesion, Single Epithelial Cell Size, Bare Nuclei, Bland Chromatin, Normal Nucleoli, and Mitoses. The output to be identified is one of two classes: Malignant (dangerous) or Benign. Consider a patient whose breast cancer class is unknown and whose variables include Clump Thickness 1, Uniformity of Cell Size 3, Uniformity of Cell Shape 2, Marginal Adhesion ... Next, calculate the feature counts and probabilities in each class. For categorical data, the probability is obtained by counting how many records in a class share the same feature value and dividing that count by the number of records in that class.
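The counting step for categorical features can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the system's actual implementation: the function name `fit_categorical_nb`, the two-feature toy rows, and their labels are all made up for the example, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def fit_categorical_nb(rows, labels):
    """Estimate class priors and per-feature likelihoods by counting."""
    class_counts = Counter(labels)
    # value_counts[c][j][v] = number of rows of class c whose feature j equals v
    value_counts = defaultdict(lambda: defaultdict(Counter))
    for row, c in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[c][j][v] += 1

    n = len(labels)
    priors = {c: class_counts[c] / n for c in class_counts}

    def likelihood(c, j, v):
        # P(feature j = v | class c) = matching rows in c / all rows in c
        return value_counts[c][j][v] / class_counts[c]

    return priors, likelihood


# Hypothetical toy data with two features per record (not from the paper).
rows = [(1, 3), (1, 1), (5, 3), (8, 7), (8, 3)]
labels = ["Benign", "Benign", "Benign", "Malignant", "Malignant"]

priors, likelihood = fit_categorical_nb(rows, labels)
print(priors["Benign"])            # share of Benign records
print(likelihood("Benign", 0, 1))  # P(feature 0 = 1 | Benign)
```

A feature value that never occurs in a class gets probability zero here, which zeroes the whole product for that class. Laplace smoothing is a common remedy, but the text does not say whether the system uses it.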
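Gaussian distribution formula: for data in numerical form, the probability of a feature value $x$ in class $C$ is calculated with the Gaussian density

$$P(x \mid C) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)$$

where $\mu$ and $\sigma$ are the mean and standard deviation of that feature within class $C$.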
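A numeric feature can be scored with the normal density as in the short sketch below. It assumes the usual parameterisation by the per-class mean and standard deviation; the function name `gaussian_likelihood` and the sample values are hypothetical and serve only as illustration.

```python
import math
from statistics import mean, stdev


def gaussian_likelihood(x, mu, sigma):
    """Normal density of x for a feature with mean mu and standard deviation sigma."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


# Hypothetical values of one numeric feature within a single class.
values = [1, 2, 1, 3, 2]
mu = mean(values)
sigma = stdev(values)  # sample standard deviation

# The density is higher for values close to the class mean.
print(gaussian_likelihood(2, mu, sigma))
print(gaussian_likelihood(5, mu, sigma))
```

The returned value is a density, not a probability, so it can exceed 1 when the standard deviation is small. That is harmless for classification because only the comparison between classes matters.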

RESULT AND DISCUSSION
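System testing is carried out by entering data into the forms provided. At this stage, the test uses 140 training data. Of these 140 data, 137 were classified in accordance with their actual class. Performance is calculated using the following formulas:

$$\text{Precision} = \frac{TP}{TP + FP} \qquad \text{Recall} = \frac{TP}{TP + FN}$$

where, for a given class, $TP$ is the number of data correctly assigned to that class, $FP$ the number wrongly assigned to it, and $FN$ the number of data of that class assigned to the other class.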
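The arithmetic behind these measures can be sketched as follows. The accuracy figure uses the 137 correct results out of 140 reported above. The true-positive, false-positive, and false-negative counts are hypothetical placeholders chosen only to show the calculation, since the per-class counts are not given in the text, and the function names are likewise assumptions.

```python
def precision(tp, fp):
    """Share of predictions for a class that were actually that class."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of actual members of a class that were predicted as that class."""
    return tp / (tp + fn)


def accuracy(correct, total):
    """Share of all test data classified correctly."""
    return correct / total


# 137 correct out of 140 is taken from the text.
print(round(accuracy(137, 140), 4))  # 0.9786

# Hypothetical counts for one class (NOT the paper's results).
tp, fp, fn = 40, 10, 5
print(precision(tp, fp))
print(recall(tp, fn))
```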

CONCLUSION
This study concludes that the system can help classify the type of breast cancer from the results of the variable examination in patients, using the Naive Bayes method in the classification process. Based on the results of interconnection testing of the system and functional testing of the program, it can be concluded that the program is feasible overall and produces diagnostic results with 96% precision for malignant, 99% precision for benign, 98% recall for malignant, and 98% recall for benign.

Table 2. Probability of each feature

Next, calculate the initial probability of each class by multiplying the probability values of the features in that class. The final probability is then obtained by multiplying the class probability by the initial probability.

Table 3. Preliminary and Final Probability Results

Finally, compare the final probabilities of the classes and take the larger value between the Malignant and Benign classes. Because the largest value is in the Malignant class, the output is "Malignant".
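The multiplication and comparison steps described for Tables 2 and 3 can be sketched as below. The priors and feature probabilities are made-up numbers, not the values from those tables, and the function name `classify` is an assumption for this illustration.

```python
import math


def classify(priors, likelihoods):
    """Return the class with the largest final probability.

    priors:      {class: P(class)}
    likelihoods: {class: [P(x_1 | class), P(x_2 | class), ...]}
    """
    final = {}
    for c in priors:
        initial = math.prod(likelihoods[c])  # product of the feature probabilities
        final[c] = priors[c] * initial       # class probability times initial probability
    best = max(final, key=final.get)
    return best, final


# Hypothetical probabilities for a single patient (not from the paper).
priors = {"Malignant": 0.35, "Benign": 0.65}
likelihoods = {
    "Malignant": [0.5, 0.4, 0.6],
    "Benign": [0.2, 0.1, 0.3],
}

label, final = classify(priors, likelihoods)
print(label)
print(final)
```

With nine features the product can become very small. Summing logarithms instead of multiplying probabilities is a common way to avoid underflow, although the text describes plain multiplication.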

Table 4. System Test Results

Table 5. System Performance Assessment