CLASSIFICATION OF SCOUT SKILLS USING NAIVE BAYES ALGORITHM

Scout Coaches often find it difficult to develop the skills of Scout members because the skill area that each learner has mastered is determined incorrectly. One way to address this problem is to classify learners by the area of expertise they have mastered. This study aims to build a system that determines the class of a learner's Scouting skill area using the Naïve Bayes algorithm, so that Scout Coaches can develop learners' skills optimally according to their areas of expertise. The assessment criteria are the scores for General Knowledge, Scouting Knowledge, Sign Language (Password, Morse, Semaphore), Knot Ties, Pioneering, and Hasta Karya. All criteria are treated as numerical features. Each learner is classified into one of three classes: Intelligence, Physical (Strength), or Creativity. The result of this study is a program that performs the calculations needed to determine the class of a Scouting skill area. The Scouting Skill Classification Program using the Naïve Bayes algorithm has been tested and reaches an accuracy of 100%.


INTRODUCTION
Scouting organizations teach several skills that can be applied in social life. These skills are tested in Scouting competitions. Developing the skills of Scout members is often difficult because the skills each learner has mastered are determined incorrectly. A computer program can help solve this problem. The problem addressed here is how to design a system that classifies Scouting skill areas using the Naïve Bayes algorithm. The data come from 5 junior high schools in Tumpang District and surrounding areas. Each record consists of scores for General Knowledge, Scouting Knowledge, Sign Language (Password, Morse, Semaphore), Knot Ties, Pioneering, and Hasta Karya. The classification result is one of the classes Intelligence, Physical, or Creativity. The purpose of this study is to help Scout Coaches and instructors determine which areas of expertise their students have mastered.

METHODOLOGY

System Analysis
This stage describes the problems the system addresses and analyzes the requirements of the system for classifying Scouting skill areas.

Problem Analysis
Scouting activities are part of the Indonesian government's program to improve the quality of education in Indonesia. One way to measure the ability of Scout members is through participation in competitions. The participants are Scout members from educational institutions. Scouting competitions cover areas of expertise that can be grouped into the classes Intelligence, Physical, and Creativity. Using data mining, specifically Naïve Bayes classification, a program was developed to classify areas of Scouting expertise.

Data Analysis
The data used in this research were collected in 2015-2016 from 5 junior high schools in Tumpang District, Malang Regency, and surrounding areas. Each record consists of scores for General Knowledge, Scouting Knowledge, Sign Language (Password, Morse, Semaphore), Knot Ties, Pioneering, and Hasta Karya, as well as the participant's original class. There are 712 records in total, consisting of 447 training records and 265 test records.

System Design
The designed system consists of 2 main parts: the master data process and the Naïve Bayes calculation process.

Master Data Process
The master data process holds the data used as training data and test data. It provides functions to add, change, and delete data. The input required by the system is participant data for training and participant data for testing.

Calculation Process of Naïve Bayes Algorithm
Building the Naïve Bayes model involves several stages, which are described in the following figure. All features are numeric, so feature probabilities are not computed during training. Instead, they are computed during testing from the mean and variance obtained in training. Both training and testing are part of the implementation process. Once the highest final probability is known, the system displays the classification result produced by the Naïve Bayes algorithm.
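The training stage, which collects a mean and a variance per feature and per class, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' Java program: the function name `train`, the record layout, and the use of the sample variance (dividing by n - 1) are assumptions, and the scores are made-up values.

```python
from collections import defaultdict
from statistics import mean, variance


def train(records):
    """Return {class: {feature: (mean, variance)}} from labeled records.

    records is a list of (features, label) pairs, where features maps a
    feature name to a numeric score. Layout and names are hypothetical.
    """
    rows_by_class = defaultdict(list)
    for features, label in records:
        rows_by_class[label].append(features)

    model = {}
    for label, rows in rows_by_class.items():
        stats = {}
        for name in rows[0]:
            values = [row[name] for row in rows]
            # Assumption: sample variance (n - 1 denominator); needs >= 2 rows.
            stats[name] = (mean(values), variance(values))
        model[label] = stats
    return model


# Made-up scores for two of the six features, two records per class.
sample = [
    ({"PU": 80, "PK": 70}, "Intelligence"),
    ({"PU": 90, "PK": 80}, "Intelligence"),
    ({"PU": 60, "PK": 50}, "Physical"),
    ({"PU": 70, "PK": 70}, "Physical"),
]
model = train(sample)
```

Only the records of a class contribute to that class's statistics, which matches the per-class mean described in the worked example.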

Example Calculation of Scouting Skill Area Classification Using Naïve Bayes
The Naïve Bayes classification calculation requires training data and test data. A number of records are used as sample training data (Table 1). From this sample, the mean and variance are calculated. The mean of each feature is calculated per class, using only the records that belong to that class. For example, if the Intelligence class contains 3 records, only the values of those 3 records are used to find the mean of each feature for the Intelligence class. The means of the training data are shown in Table 2.

The next step is to calculate the variance and final variance of the training data. An example is the calculation for the PU (General Knowledge) competition item in the Intelligence class. The variance and final variance values are shown in Tables 3 and 4.

After the mean, variance, and final variance are obtained, the next step is to calculate the probability values of the numerical features. This requires test data, which are shown in Table 5. The Naïve Bayes algorithm processes records one at a time, classifying each through several stages, so a single test record is enough to demonstrate the calculation. Because all features are numerical, the probability of each feature is calculated with the formula for numerical features, using the class mean and variance from the training data. Table 6 shows the probability values of the numerical features for the participant Kharisma Pramiswara, who is used as the test record.

The next step is to calculate the probability of each class. This calculation uses the numerical feature probabilities of Kharisma Pramiswara and multiplies together all feature probabilities that belong to the same class. The final stage of the Naïve Bayes calculation is to compare the final probability values obtained for each class. The record is assigned to the class with the highest final probability. For the test record, the highest final probability belongs to the Intelligence class. That is, Kharisma Pramiswara is classified as having Scouting skills in the Intelligence class.
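The formula for numerical features is not reproduced in the text. A common choice for Naïve Bayes with numeric features, assumed here, is the Gaussian density evaluated with the class mean and variance. The symbols x_i (score of feature i), \mu_{i,C} and \sigma_{i,C}^2 (mean and variance of feature i in class C), and P_final are notation introduced for this sketch, not taken from the source.

```latex
P(x_i \mid C) = \frac{1}{\sqrt{2\pi\sigma_{i,C}^{2}}}
  \exp\!\left(-\frac{(x_i - \mu_{i,C})^{2}}{2\sigma_{i,C}^{2}}\right)

P_{\text{final}}(C) = P(C)\,\prod_{i=1}^{6} P(x_i \mid C), \qquad
\hat{C} = \arg\max_{C} P_{\text{final}}(C)
```

The second line restates the procedure in the text: the six feature probabilities of a class are multiplied, the product is weighted by the class probability from the training data, and the class with the highest final value is chosen.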

Data Flow Diagram (DFD)
A data flow diagram (DFD) is a structured method for developing data systems. It clearly describes all of the activities in the system.

Figure 2. Data Flow Diagram
Based on Figure 2, the Scout Coach is the system administrator and the Scout instructor is a system user. The Scout Coach has full access to the system: the Coach can manage training data and test data and can also access the classification calculation process. Scout instructors can only access test data and classification results.

Entity Relation Diagram (ERD)
An ERD is a data modeling tool that helps organize the data in a project into entities and determines the relationships between those entities. It also explains the relationships between data in the database. In this case, the ERD design explains the relationships between the attributes used in the Naïve Bayes calculations.

RESULTS
This research produced a program that classifies the areas of expertise mastered by Scout members. The program is written in the Java programming language. The program interface is shown in Figure 4. In the Master Data process, the administrator can manage the data used in the system. By clicking on the name of a record, the administrator can change its details.

Figure 5. Results Classification in Naïve Bayes Calculation Process
To perform a calculation, the user must first set the training data. The user types in the number of training records and clicks the "Data Set" button, then clicks the "Data Set" button to the right of the test data amount column. The user then clicks the "Next" button and follows the steps. In the last panel, the user can see the classification result. Users can choose whether or not to save their classification results. Unsaved classification results cannot be seen on the classification report page.

System Testing
The testing phase measures the accuracy of the system's Naïve Bayes calculations. Because male and female participants are calculated separately, each test is performed twice, once per gender. As can be seen in Table 8, with part of the training data used, the accuracy reached 100% for both genders. Likewise, when all available training data were used, no test record was classified differently from its original class. This means that the program's accuracy reaches 100%: the average accuracy is 100% and the error rate is 0%.

Analysis of the Effect of Training Data Classes on Classification Results
System testing showed that the accuracy reaches 100%. A further test was carried out by adding incorrectly labeled records to the training data. For each gender, 60 records had their class changed to a wrong class. As shown in Table 9, some test records are then classified differently from their original class. For male participants, 12 results differ, so the accuracy decreased to 90.4% and the error rate increased to 9.6%. For female participants, 6 results differ, so the accuracy decreased to 95.7% and the error rate increased to 4.3%. This suggests that the class labels assigned to the training data affect the final classification results of the Naïve Bayes algorithm.
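The accuracy and error-rate figures can be reproduced with simple arithmetic. The Python sketch below assumes 125 male and 140 female test records. These counts are not stated in the text; they are inferred from the reported percentages and are consistent with the 265 test records in total. The function names are hypothetical.

```python
def accuracy_percent(total, misclassified):
    """Percentage of test records classified into their original class."""
    return round(100 * (total - misclassified) / total, 1)


def error_percent(total, misclassified):
    """Percentage of test records classified into a different class."""
    return round(100 * misclassified / total, 1)


# Assumed test-set sizes (inferred, not stated in the text): 125 male, 140 female.
male = (accuracy_percent(125, 12), error_percent(125, 12))
female = (accuracy_percent(140, 6), error_percent(140, 6))
```

Under these assumed sizes, `male` and `female` reproduce the percentages reported for the test with incorrectly labeled training data.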

CONCLUSION
The Naïve Bayes algorithm achieves high accuracy in classifying Scouting skill areas, reaching 100% when all available training data are used. The available training data comprise 207 records for male participants and 240 records for female participants. The tests with correctly labeled training data and with incorrectly labeled training data show that variation in the training data classes can affect the classification results. The class assigned to new training data changes the class probabilities used in the calculation, and thus the final classification result. Training data are currently selected at random by the system; for further research, it is advisable to add a function for selecting the training data. For further development, it is hoped that the program will be built as a web-based application for wider access by users.

P(Class) = P(PU | Class) × P(PK | Class) × P(SMS | Class) × P(SI | Class) × P(PIO | Class) × P(HK | Class)

Calculation for the Intelligence class:
P(INTELLIGENCE) = 0.0018 × 0.0453 × 0.0363 × 0.057 × 0.0453 × 0.0975 = 0.0000000007394

Calculation for the Physical class:
P(PHYSICAL) = 0.0111 × 0.0024 × 0.0226 × 0.0019 × 0.0018 × 0.0122 = 0.00000000000002649

Calculation for the Creativity class:
P(CREATIVITY) = 0.0781 × 0.0422 × 0.0325 × 0.0594 × 0.0946 × 3.718e-07 = 0.0000000000002237

After the probability value of each class is found, the final probability of the test data is calculated. To calculate the final probability, the probability value of each class is combined with the class probability obtained from the training data. Here is the calculation.

Calculation of the final probability value for the Intelligence class:

Calculation of the final probability value for the Creativity class:
P(CREATIVITY) = 0.0000000000002237 × 0.
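The three products can be rechecked with a few lines of Python. The sketch copies the feature probabilities from the calculations in the order PU, PK, SMS, SI, PIO, HK; the variable names are hypothetical. Because the printed inputs are rounded, the recomputed products agree with the reported values only approximately, and the comparison covers the products before they are weighted by the class probabilities.

```python
from math import prod

# Feature probabilities of the test record, copied from the worked example
# in the order PU, PK, SMS, SI, PIO, HK.
feature_probs = {
    "Intelligence": [0.0018, 0.0453, 0.0363, 0.057, 0.0453, 0.0975],
    "Physical": [0.0111, 0.0024, 0.0226, 0.0019, 0.0018, 0.0122],
    "Creativity": [0.0781, 0.0422, 0.0325, 0.0594, 0.0946, 3.718e-07],
}

# Multiply the six feature probabilities of each class.
class_products = {label: prod(values) for label, values in feature_probs.items()}

# Class with the largest product (before weighting by class probability).
best = max(class_products, key=class_products.get)

for label, value in class_products.items():
    print(f"{label}: {value:.4e}")
print("Largest product:", best)
```

The Intelligence product is several orders of magnitude larger than the other two, which is in line with the classification reported for the test record.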

Figure 3. Entity Relation Diagram of the System

Figure 4. Master Data Interface of the Program

Table 1. Example of Train Data

Table 2. Mean of Train Data

Table 3. Variance Values from the Example Train Data

Table 4. Final Variance of the Example Train Data

Table 5. Examples of Test Data

Table 6. Probability of Numerical Features of Participant Kharisma Pramiswara

After the probabilities of the numerical features are obtained, the class probability is calculated. The probability of a class is calculated by dividing the number of records in that class by the total number of training records. This gives the class probability value for each class.
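The class probability step amounts to counting labels. A minimal Python sketch follows, assuming the training labels are available as a plain list; the function name `class_priors` and the label counts are hypothetical and are not the values used in the study.

```python
from collections import Counter


def class_priors(labels):
    """Return {class: count / total} for a list of training labels."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Made-up label counts for illustration only.
labels = ["Intelligence"] * 3 + ["Physical"] * 2 + ["Creativity"] * 5
priors = class_priors(labels)
```

The resulting values sum to 1, and each one weights the product of feature probabilities for its class in the final probability.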

Table 8. Test Result Analysis

Table 9. Analysis of the Effect of Training Data Classes on Classification Results