Diagnosis of Asthma Disease and The Levels using Forward Chaining and Certainty Factor

Asthma is a major global health issue that affects at least 300 million people worldwide. Predicting the severity of asthma is difficult even for clinicians working in emergency rooms, and predicting the intensity of an asthma attack is harder still because it depends on a number of factors, including the characteristics and severity of the patient's illness. Forward Chaining and Certainty Factor algorithms can be implemented to diagnose the degree of asthma control, so that consultation through the system becomes more detailed. The resulting expert system can be used as an initial reference in the diagnosis process. The Forward Chaining algorithm reasons from facts toward a solution, while the Certainty Factor algorithm provides a level of confidence for the conclusions generated by the Forward Chaining algorithm. The research was carried out in several phases: analysis, data preparation, modeling, and evaluation. The evaluation was conducted in three stages and tested using 80 medical records. The study produced an expert system with an accuracy of 65%, a precision of 58.3%, and a recall of 57.13%. Therefore, Forward Chaining and Certainty Factor perform reasonably well in the diagnosis of asthma.


Introduction
Asthma is one of the most common diseases; it affects at least 300 million people worldwide and causes significant disability. Despite significant advances in asthma therapy, mortality from this disease is still very high. In addition, a million people are affected by asthma every year. By 2025, the global asthma population is expected to surpass 400 million [1]. According to the results of the Household Health Survey Indonesia, asthma is the fourth leading cause of death in Indonesia, at 5.6%. It should be noted that the number of asthma cases in Indonesia is 13 per 1,000 [2].
It is very difficult to predict the severity of asthma, even for doctors working in the emergency department [3]. Predicting the intensity of an asthma attack is even more challenging because it depends on a number of factors, including the characteristics and severity of the patient's illness. There have been many attempts to predict asthma attacks, such as through telemedicine and wearable systems, but few have succeeded because of the lack of reliable follow-up data [4]. Uncontrolled asthma can lead to reduced community productivity and quality of life, as well as higher medical costs, hospitalization risk, and even mortality [5]. For these reasons, research on detecting asthma is considered important, so that patients can determine their level of asthma and can anticipate and manage the disease.
In this research, Forward Chaining and Certainty Factor algorithms are used to diagnose the asthma level from the symptoms. The symptoms were obtained from expert judgment, so that they can serve as facts when diagnosing the asthma level with Forward Chaining. In general, the Forward Chaining algorithm is used to reason from known facts toward a solution [6], while the Certainty Factor algorithm is used to provide a level of confidence for the conclusions generated by the Forward Chaining algorithm [7].
Several studies related to expert systems for the diagnosis of asthma have been carried out. Oktavia and Decky Pranala [8] designed an expert system for diagnosing asthma and its solutions using the Forward Chaining method [9]. The system can diagnose asthma based on existing symptom data. However, its drawback is that the accuracy of the diagnosis results is not reported, so the level of performance of the system cannot be known. Alfatah A. et al. proposed Decision Tree and Dempster-Shafer for diagnosing patients with lung disease. The study used medical record data comprising 65 cases, of which 54 were diagnosed in accordance with the doctor's diagnosis, giving an accuracy of 83.08%. Niki Ratama [10] compared Certainty Factor and Decision Tree for an asthma diagnosis application system. The accuracy of Certainty Factor was 100 percent, whereas that of Decision Tree was 50 percent, with the figures derived from case study data and user-completed surveys. Widians and Hidayati [11] used Certainty Factor to diagnose asthma in children. The research resulted in a system that can detect asthma in children and report the level of confidence in the diagnosis as a percentage. However, the system was tested only by comparing its output with the results of an expert's manual calculations. Ardi Wijaya and Rozali Toyib [12] implemented the Genetic Algorithm in an asthma diagnosis expert system. This algorithm randomly combines the best candidate solutions in a population to produce a next generation of solutions that maximizes compatibility, commonly called fitness. In the system test, 47% of the responses rated the system very interesting, 45% interesting, and 8% not interesting. The genetic algorithm's computation time is also less efficient.
Many studies have addressed the diagnosis of asthma; the purpose of this research is to develop an expert system that determines the level of asthma control. The research is also expected to help doctors identify the level of asthma, which can be used for the initial treatment of asthma. Forward Chaining and Certainty Factor are used to determine and categorize the level of asthma.

Research Methods
The study focused on determining the level of asthma control as defined by the Global Initiative for Asthma (GINA). The level of asthma control is determined from five symptom criteria commonly experienced by patients, including the intensity of asthma relapses, sleep disturbances, use of reliever drugs, and activity limitations [13]. To identify the general symptoms of asthma, the research started with expert interviews and direct observation at the Public Health Center, Bandung.
In this research, the data were obtained from patients' medical records. There are 80 records that indicate the patients' symptoms. The data are used as a knowledge base for determining asthma levels and are categorized into three levels: totally controlled, partially controlled, and not controlled. Forward Chaining and Certainty Factor are implemented to diagnose the asthma level. First, the result from the Forward Chaining and Certainty Factor algorithms is compared with the result from the expert. To obtain the accuracy, precision, and recall values, the results are then mapped to a confusion matrix. These values are needed to see how effective Forward Chaining and Certainty Factor are in predicting the asthma level [14].

Methodology of Proposed System
In this study, Forward Chaining and Certainty Factor algorithms were used to diagnose asthma. The Forward Chaining algorithm is used to reach conclusions, while the Certainty Factor algorithm is used to determine the level of confidence in the resulting diagnosis. In addition, this study also examines asthma and its degrees, along with how to handle it [15].
The advantage of this study over previous studies lies in the system testing stage, which is divided into three parts. The first step is to compare the results of the system's calculations with the results of manual calculations. The next step is to test the system on the medical records held by an agency. Finally, the confusion matrix method is used to calculate the sensitivity, accuracy, and precision values. These values serve as a reference for whether the system works well or not [16].

Forward Chaining
Forward chaining is a search technique that starts from known facts and traces forward to reach a goal. In expert systems, this algorithm is built into the inference engine component because of its usefulness [17], [18]. In forward chaining, a rule can have multiple conditions linked by AND, OR, or a mixture of both.
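As an illustration of rules whose conditions mix AND and OR, the following is a minimal sketch of a forward chaining loop. The rule contents, the intermediate fact `frequent_symptoms`, and the function names are hypothetical and are not taken from the paper's knowledge base; only the symptom codes (G1, G2, ...) and level codes (P1, ...) follow the paper's notation.

```python
# A condition is either a fact name (str) or a tuple ("AND" | "OR", cond, cond, ...).
# Both rules below are hypothetical examples, not the paper's actual rules.
RULES = [
    (("OR", "G1", "G2"), "frequent_symptoms"),
    (("AND", "frequent_symptoms", "G3"), "P1"),
]


def holds(condition, facts):
    """Return True if the (possibly nested) condition is satisfied by the facts."""
    if isinstance(condition, str):
        return condition in facts
    operator, *operands = condition
    if operator == "AND":
        return all(holds(c, facts) for c in operands)
    if operator == "OR":
        return any(holds(c, facts) for c in operands)
    raise ValueError(f"unknown operator: {operator}")


def forward_chain(rules, initial_facts):
    """Fire rules repeatedly until no new fact can be derived."""
    facts = set(initial_facts)
    changed = True
    while changed:
        changed = False
        for condition, conclusion in rules:
            if conclusion not in facts and holds(condition, facts):
                facts.add(conclusion)
                changed = True
    return facts


derived = forward_chain(RULES, {"G2", "G3"})
print(sorted(derived))
```

Starting from the facts G2 and G3, the first rule derives the intermediate fact and the second rule then derives P1; with G3 alone, no rule fires.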

Certainty Factor
The Certainty Factor (CF) method was developed to handle the uncertainty in expert knowledge. It reflects the way experts approach a problem, as they often use terms such as "probable," "most likely," and "almost" when analyzing the available information. The method is also used to express the degree of confidence [19].
In this study, we interviewed experts to assign Measure of Belief (MB) and Measure of Disbelief (MD) scores to each symptom that identifies a particular level of asthma control. The MD value is then subtracted from the MB value to obtain the expert CF value [20]:
$$CF(H,E) = MB(H,E) - MD(H,E)$$
where
MB(H,E): A measure of the belief level of Hypothesis H when influenced by evidence E (between 0 and 1)
MD(H,E): A measure of the disbelief level of Hypothesis H when influenced by evidence E (between 0 and 1)
During a consultation, the user selects a symptom and chooses a confidence level for it (see Table 1). That value is then multiplied by the expert CF to obtain the sequential CF.
If two or more symptom CF values are present, the combination formula is used to obtain CFcombine, which merges the values of the individual symptoms.
Finally, the combined CF is multiplied by 100 to obtain the percentage confidence level for the user's level of asthma control.
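The steps above (expert CF, sequential CF, combined CF, percentage) can be sketched as follows. This is a minimal sketch under stated assumptions: the combination step uses the commonly cited form CFcombine = CF_old + CF_new × (1 − CF_old), which applies to non-negative CF values and is assumed here rather than taken from the paper; the function names and the MB, MD, and user CF numbers are hypothetical.

```python
def expert_cf(mb, md):
    """Expert CF: measure of belief minus measure of disbelief."""
    return mb - md


def sequential_cf(user_cf, exp_cf):
    """Sequential CF: the user's confidence multiplied by the expert CF."""
    return user_cf * exp_cf


def combine_cf(cf_values):
    """Combine non-negative CF values one by one.

    Assumed combination rule: CF_old + CF_new * (1 - CF_old).
    """
    combined = 0.0
    for cf in cf_values:
        combined = combined + cf * (1.0 - combined)
    return combined


def confidence_percent(cf_values):
    """Express the combined CF as a percentage."""
    return combine_cf(cf_values) * 100.0


# Hypothetical symptoms: (MB, MD, user CF). Not values from the paper.
symptoms = [(0.8, 0.2, 0.8), (0.6, 0.1, 0.6)]
sequential = [sequential_cf(user, expert_cf(mb, md)) for mb, md, user in symptoms]
print(f"{confidence_percent(sequential):.1f}%")
```

With these illustrative numbers the sequential CFs are 0.48 and 0.30, and combining them gives 0.48 + 0.30 × (1 − 0.48) = 0.636, that is, a confidence level of about 63.6%.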

Confusion Matrix
A confusion matrix is a tool for analyzing or evaluating the performance of a model. It takes the form of an N x N matrix used to evaluate a classification model, where N is the number of target classes, so it can be applied to binary as well as multiclass classification problems. Table 2 shows the general form of a confusion matrix [14]. Using the confusion matrix, we can calculate the accuracy, precision, and recall.

1) Accuracy
Accuracy is the degree of similarity between the predicted and actual values. It is the ratio of correct predictions (positive and negative) to all available data.

2) Precision
Precision refers to the correspondence between the information sought by the user and the system's response. It is the ratio of true positive predictions to all positive predictions.

3) Recall
Recall is the success rate of the system in retrieving information. It is the ratio of correctly predicted positive data to all actual positive data.
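For reference, the three measures can be written in terms of the TP, TN, FP, and FN counts defined for Table 2. These are the standard textbook forms and are assumed, not confirmed, to match the paper's numbered equations.

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall}    &= \frac{TP}{TP + FN}
\end{align}
```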

Knowledge Base
The severity of asthma relapses, sleep problems, use of reliever medicines, and activity limitations are among the five symptom criteria used to define a patient's level of asthma control. Based on the interview with an expert, rules were created to identify each level of asthma control. The rules are represented as condition-action pairs of the form IF condition (premise) THEN conclusion [6], [21], [22]. The rules form the knowledge base implemented in the system, so that the system can diagnose the level of asthma control. The knowledge base for diagnosing the level of asthma control is shown in Table 3. Based on Table 3, the various symptoms become the basis for the rules used to build the decision tree depicted in Figure 1. The notation G1 to G15 denotes the symptoms used to categorize the disease into three categories (P1-P3): uncontrolled, partially controlled, and totally controlled. Doctors can use these categories to decide on the initial treatment of asthma patients. The rules are listed in Table 3 and the decision tree is illustrated in Figure 1.
Figure 1. Decision Tree

System Architecture
The system is used by two actors, the user and the expert, both of whom access it through the user interface. An expert who wants to add knowledge to the system can use the knowledge acquisition facility to create, update, and delete data. Data entered through this facility are turned into rules in the knowledge base, which the system refers to when making a diagnosis (see Figure 2).
Figure 2. System Architecture
When the user performs a consultation through the system, the CF value that the user enters for each symptom is stored in the working memory. These values are then matched against the rules that the expert created in the knowledge base. The calculation based on the existing parameters is also carried out at this stage. Both the matching and the calculation take place in the inference engine, where the Forward Chaining and Certainty Factor algorithms run and a conclusion is obtained. The results are then sent to the explanation facility, where the user can see the results of the consultation.

Flowchart
This flowchart illustrates how the user consults with the system. In practice, the user must input the CF value for each symptom before pressing the diagnosis button.
After that, the system calculates the result using the Forward Chaining and Certainty Factor algorithms. The results of the consultation are then shown and can be printed in PDF form [23].

Results and Discussions
This experiment was carried out using medical record data from the Public Health Centre. In total, 80 medical records were used. Each record contains the patient's symptoms and the diagnosed level of asthma control. Because patient data are kept confidential, the data were collected one by one by copying them. In addition, expert knowledge was captured to obtain information about asthma levels and the treatment of asthma.

Algorithm Testing
The existing medical record data were then tested in the system that had been created [24]. However, the Certainty Factor algorithm requires the patient's level of confidence in each symptom he or she feels, which is not contained in the medical record data, and patients could not test the system at the research site because of the pandemic.
Therefore, because the Certainty Factor algorithm test only compares the results of manual calculations with the results obtained by the system, each symptom felt by the patient is given a CF value of 0.8 (Almost Certainly), and each symptom not selected by the patient is given a value of 0.2 (Unknown). The results of system testing on the medical record data are shown in Table 4. Based on Table 4, the diagnosis produced by the Forward Chaining and Certainty Factor algorithms implemented in the system matched the expert's diagnosis for 52 records and did not match for 28. However, in the comparison between the manual calculations and those generated by the system, all calculations agreed fully.

Confusion Matrix Testing
The data previously obtained in Table 4 can be arranged in the confusion matrix format [25]. All data in each class are grouped according to the system's prediction and the actual class, giving the confusion matrix shown in Table 5. Although the confusion matrix is 3x3, the accuracy, precision, and recall (sensitivity) values can still be calculated as follows.

1) Accuracy
Accuracy is the ratio of correct predictions (positive and negative) to all available data. With 52 of the 80 records predicted correctly, the accuracy is calculated from the confusion matrix as follows.
$$\text{Accuracy} = \frac{52}{80} \times 100\% = 65\%$$

2) Precision
Because the confusion matrix is 3x3, the precision value of each class must be calculated first; the values are then added up and averaged. The formula for calculating precision is given in equation (6), and the results are shown in Table 6.

3) Recall
Recall, or sensitivity, is calculated in the same way as precision: the recall value of each class is calculated first, and the values are then added up and averaged. The formula for calculating recall is given in equation (8), and the results are shown in Table 7.
The expert system built here is a rule-based expert system whose knowledge base was obtained from an interview with an expert at the Babakan Sari Health Centre. The number of classes is three, namely Uncontrolled, Partially Controlled, and Totally Controlled, so the actual data and the data predicted by the system can be aligned class by class and the calculation process is appropriate.
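The per-class averaging used for precision and recall above can be sketched as follows. This is a minimal sketch: the 3x3 matrix is a hypothetical example and is not the paper's Table 5, the convention that rows hold actual classes and columns hold predicted classes is an assumption, and the function names are illustrative.

```python
# Hypothetical 3x3 confusion matrix (NOT the paper's Table 5).
# Assumed convention: rows = actual class, columns = predicted class.
MATRIX = [
    [5, 1, 0],
    [2, 3, 1],
    [0, 1, 7],
]


def accuracy(matrix):
    """Correct predictions (the diagonal) divided by all data."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total


def per_class_precision(matrix):
    """For each class: diagonal cell divided by its column total."""
    n = len(matrix)
    values = []
    for k in range(n):
        predicted_k = sum(matrix[i][k] for i in range(n))
        values.append(matrix[k][k] / predicted_k if predicted_k else 0.0)
    return values


def per_class_recall(matrix):
    """For each class: diagonal cell divided by its row total."""
    values = []
    for k, row in enumerate(matrix):
        actual_k = sum(row)
        values.append(row[k] / actual_k if actual_k else 0.0)
    return values


def macro_average(values):
    """Add up the per-class values and divide by the number of classes."""
    return sum(values) / len(values)


print(f"accuracy  = {accuracy(MATRIX):.4f}")
print(f"precision = {macro_average(per_class_precision(MATRIX)):.4f}")
print(f"recall    = {macro_average(per_class_recall(MATRIX)):.4f}")
```

Averaging the per-class values with equal weight (often called macro-averaging) gives each of the three asthma control levels the same influence on the final figure, regardless of how many records fall into it.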
Based on the analysis, most of the mismatched data were cases in which the expert's result was Partially Controlled. The MB and MD values of the symptoms that indicate the Partially Controlled level are smaller than those of the symptoms that indicate the Totally Controlled and Uncontrolled levels. This affects the diagnosis generated by the system, because the system takes the level with the largest calculated value as its result.
Another factor that affects the results is that a user CF value of 0.8 (Almost Certainly) was assigned to every symptom felt by the patient and a user CF value of 0.2 (Unknown) to every symptom not selected by the patient in the medical record data. This makes the calculation non-specific, because the patient should enter a user CF value for each symptom according to his or her own level of confidence; in other words, the patient should consult the system directly.
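The combined effect of these two factors can be illustrated with a small sketch. Everything in it is hypothetical: the expert CF values, the assignment of symptoms to levels, the combination rule CF_old + CF_new × (1 − CF_old), and the assumption that every symptom of a level contributes to that level's score (0.8 if present, 0.2 if absent). It only shows how a level with smaller expert CF values can lose to another level when the largest value is taken.

```python
USER_CF_PRESENT = 0.8  # fixed value given to symptoms found in the record
USER_CF_ABSENT = 0.2   # fixed value given to symptoms not found in the record

# Hypothetical expert CF values (MB - MD) per level; not the paper's values.
EXPERT_CF = {
    "Uncontrolled": {"G1": 0.8, "G2": 0.8},
    "Partially Controlled": {"G3": 0.4, "G4": 0.4},
}


def combine_cf(cf_values):
    """Assumed combination rule for non-negative CF values."""
    combined = 0.0
    for cf in cf_values:
        combined = combined + cf * (1.0 - combined)
    return combined


def level_confidence(symptom_cfs, patient_symptoms):
    """Combined CF of one level under the fixed user CF values."""
    sequential = [
        (USER_CF_PRESENT if symptom in patient_symptoms else USER_CF_ABSENT) * cf
        for symptom, cf in symptom_cfs.items()
    ]
    return combine_cf(sequential)


def diagnose(patient_symptoms):
    """Return the level with the largest combined CF, plus all scores."""
    scores = {
        level: level_confidence(cfs, patient_symptoms)
        for level, cfs in EXPERT_CF.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Two Partially Controlled symptoms plus one Uncontrolled symptom.
level, scores = diagnose({"G1", "G3", "G4"})
print(level, scores)
```

In this illustrative setup the record shows both Partially Controlled symptoms but only one Uncontrolled symptom, yet the Uncontrolled level obtains the larger combined CF because its expert CF values are larger.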

Conclusion
The Forward Chaining algorithm requires knowledge base data implemented in the form of rules, which are then matched with the data entered by the user. When conducting a consultation, the user must enter a user CF value for each symptom so that the Certainty Factor algorithm can produce a level of confidence in the diagnosis results.
The results are affected by the fact that the MB and MD values differ for each level and that the user CF value was not supplied directly by the user during a consultation. The accuracy of the two algorithms is 65%, the precision is 58.3%, and the recall or sensitivity is 57.13%. Therefore, Forward Chaining and Certainty Factor perform reasonably well in diagnosing asthma.
In further research, the system should be tested directly with patients so that the user's CF value for each symptom becomes clearer. The symptom data and knowledge base can be updated in the future if new symptoms that identify the level of asthma are found.

Table 2. Confusion Matrix
Where:
TP : Number of data with a positive actual value and a positive predicted value.
TN : Number of data with a negative actual value and a negative predicted value.
FP : Number of data with a negative actual value and a positive predicted value.
FN : Number of data with a positive actual value and a negative predicted value.

Table 3
AND You have never taken medication to relieve your asthma
AND You often avoid triggers for your asthma flare-ups (dust, cold, animals)
THEN Totally Controlled

DOI: https://doi.org/10.29207/resti.v6i5.4123
Creative Commons Attribution 4.0 International License (CC BY 4.0)