Ant Colony Optimization Modelling for Task Allocation in Multi-Agent System for Multi-Target

Task allocation in a multi-agent system can be defined as the problem of allocating a number of agents to tasks. One of the problems in task allocation is optimizing the allocation of heterogeneous agents when there are multiple tasks that require several capabilities. To solve this problem, this research modifies the Ant Colony Optimization (ACO) algorithm so that it can be employed for task allocation problems with multiple tasks. We optimize the performance of the algorithm by minimizing the task completion cost as well as the number of overlapping agents. We also maximize the overall system capabilities in order to increase efficiency. Simulation results show that the modified ACO algorithm significantly decreases the overall task completion cost as well as the overlapping agents factor compared to the benchmark algorithm.


Introduction
Multi-agent systems have recently been used in various fields due to their superiority over single-agent systems in completing complex tasks [1]-[4]. One of the problems in multi-agent systems is task allocation, i.e. the problem of allocating a group of agents to complete tasks in order to achieve the system's goal [2], [5]-[10]. Real-world task allocation problems in multi-agent systems include the coordination and planning of multi-robot deployment in production processes [11], the coordination of several drones [4], [12]-[14], and multi-robot allocation in precision agriculture [15]-[18].
Task allocation in a multi-agent system is an optimization problem with high complexity. It is classified as an NP-hard problem with difficulty in finding an exact solution [1]. Several approaches have been used to find the best solution to solve the task allocation problem in a multi-agent system. Some of the widely used approaches are the heuristic methods, such as the Auction-based method which is inspired by the economic system [3], [19]. The advantage of this method is that it has a high scalability. However, the required computational resources increase as the scale of the problem increases [14].
Other heuristic methods that have been used to solve the task allocation problem in multi-agent systems are inspired by natural events (bio-inspired), e.g. the Genetic Algorithm (GA) [20] and Ant Colony Optimization (ACO) algorithms [7], [18], [21]-[23]. These bio-inspired methods tend to require fewer computational resources than other methods [14]. In general, GA can find the best solution for task allocation in a multi-agent system faster than other methods. However, the efficiency of the GA search process decreases as the scale of the problem increases, due to the increasing number of possible solutions built at the beginning of the iteration [24]. Another bio-inspired heuristic method, the ACO algorithm, uses heuristic information and a learning mechanism in the form of pheromone trails to find the solution. Although its convergence rate in the early iterations is relatively slow, the efficiency of the search process carried out by the ACO algorithm improves as the pheromone trail increases [24].
The ACO algorithm is an optimization algorithm introduced by Dorigo [25]. It is inspired by the behavior of ants, i.e. using pheromone trails to find food. Wang [21] introduced a modification of the ACO algorithm to solve the task allocation problem in a multi-agent system by considering the distance factor between agents. Another study conducted by Sriatun [23] extended this approach to a landslide disaster scenario by also considering the travel cost between the agents and the target. Both studies, however, consider only a single target.
An example of a multi-target scenario is a landslide disaster scenario in which there may be more than one target (victim) to be rescued. Unlike the single-target scenario, a multi-target scenario may suffer from one or more overlapping agents, i.e. agents chosen for different targets. To overcome this problem, it is necessary to add an objective function that minimizes the number of overlapping agents and that is optimized simultaneously with the other objective functions (a multi-objective optimization problem). Several studies have shown that the ACO algorithm can be modified to solve multi-objective optimization problems [26]-[31].
This study aims to modify the ACO algorithm to solve the task allocation problem in a multi-target, multi-agent system. The final solution is obtained by optimizing all objective functions, i.e. minimizing the cost of task completion, minimizing the number of overlapping agents, and maximizing the system capabilities. Simulations were conducted to compare the efficiency of the modified ACO algorithm with the existing benchmark algorithms.

Ant Colony Optimization (ACO) for Solving Task Allocation Problems in Multi-Agent Systems
The ACO algorithm is a heuristic method that solves optimization problems by imitating the behavior of ant colonies, i.e. utilizing pheromone trails to find food. One well-known application of ACO is the Traveling Salesman Problem (TSP). Here, ants are represented as artificial agents that travel through all the cities to be visited and find the shortest route by utilizing pheromone trails [25]. The flowchart of the original ACO algorithm for TSP is depicted in Figure 1 [32]. As can be seen from the figure, each ant initially chooses the starting point of its solution at random. The ant then uses the pheromone values and the distances between the starting point and the candidate points to select the next point of its solution.
In addition to solving TSP, the ACO algorithm has also been developed to solve other optimization problems, including the task allocation problem in multi-agent systems [21], [23]. The task allocation problem in a multi-agent system is defined as the optimization problem of finding the best agent coalition to complete a target. In this scenario, a target requires one or more agent capabilities to be completed, and an agent has one or more distinct capabilities. Agent capabilities are represented in a multi-agent capability matrix, $MGK$, where each element $g_i k_j$ indicates whether agent $g_i$ has capability $k_j$, and its value is the weight of capability $k_j$ owned by agent $g_i$. For example, $g_i k_j \in [0, 10]$ and $MGK = \begin{bmatrix} 6 & 0 & 3 \\ 1 & 8 & 10 \end{bmatrix}$ indicate that there are two agents and three capabilities whose weights range from zero to ten. The first row of $MGK$ contains the capability weights of the first agent for capabilities 1, 2, and 3 [21], i.e. agent $g_1$ has capability $k_1$ with a weight of six, capability $k_2$ with a weight of zero, and capability $k_3$ with a weight of three.
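The capability matrix and the grouping of agents by capability can be illustrated with a minimal Python sketch. It assumes that the second row of the example matrix is (1, 8, 10) and that a weight of zero means the agent lacks the capability; the function names `agents_with_capability` and `agent_groups` are hypothetical and do not come from [21] or [23].

```python
# Example capability matrix: rows are agents, columns are capabilities.
# Row 1 (6, 0, 3) is stated in the text; row 2 (1, 8, 10) is an assumed
# reading of the remaining values of the flattened matrix.
MGK = [
    [6, 0, 3],   # agent g1: weights for k1, k2, k3
    [1, 8, 10],  # agent g2: weights for k1, k2, k3
]


def agents_with_capability(mgk, capability):
    """Return 1-based indices of agents with a nonzero weight for the
    given 1-based capability index (assumption: zero means 'lacks it')."""
    return [i + 1 for i, row in enumerate(mgk) if row[capability - 1] > 0]


def agent_groups(mgk, required):
    """Map each required capability to the group of agents that provide it."""
    return {k: agents_with_capability(mgk, k) for k in required}


groups = agent_groups(MGK, required=[1, 2, 3])
print(groups)
```

Each value in `groups` plays the role of one agent group from which an ant later selects a single agent.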
Similar to TSP, the selection of an agent coalition in a multi-agent system can be viewed as a graph path search problem. For example, assume that there are $V$ agents in a multi-agent system: $G = \{g_1, g_2, g_3, \ldots, g_V\}$. A target $w$ requires $R$ capabilities to be completed: $K_w = \{k_1, k_2, \ldots, k_R\}$. The group of agents that have capability $k_r$ to complete $w$ is denoted as $G_w^{k_r}$. Since $w$ requires $R$ capabilities to be completed, there are $R$ groups of agents. From each group of agents, one agent $g_i \in G_w^{k_r}$ is then selected, with $i \in [1, V]$ and $r = \{1, 2, 3, \ldots, R\}$. To find the solution using the ACO approach, each agent $g_i$ in each group of agents $G_w^{k_r}$ is represented as a node, and the connecting path between agents in different groups is represented as an edge. The task allocation process is illustrated in Figure 2.
Figure 2. Illustration of a multi-agent system task allocation problem, where $s_r$ is the number of agents with capability $k_r$ ($r = \{1, 2, \ldots, R\}$) [21]
Wang [21] modified the basic ACO algorithm to solve a task allocation problem in multi-agent systems, namely resource allocation in cloud computing. The algorithm is called the Collective Path Ant Colony Optimization (CPACO) algorithm [21]. In [21], the problem occurs in a dynamic and rapidly changing environment, so decentralized coordination is chosen. Here, the coordination process is distributed among all agents, so the agents need to communicate with each other to determine the best agent coalition. Thus, to produce optimal system performance, the algorithm is modified by adding the weight of the agents' capabilities and the communication cost between agents [21].
The study by Sriatun [23] further developed the CPACO algorithm to solve the task allocation problem of a multi-agent system in a landslide disaster scenario. In this scenario, the victim (target) must be rescued by a multi-robot system. To produce the best agents (robots) coalition, the CPACO algorithm is modified by adding a travel cost factor, i.e. the distance between the chosen agents/robots and the target. Thus, the modified algorithm in [23] considers not only the weight of agents' capabilities and the communication cost between agents, but also travel cost between the agents and the target. The modified CPACO algorithm by Sriatun [23] is hereinafter called the CPACO-S algorithm.
In the CPACO and CPACO-S algorithms, the ants in the colony search for a solution based on the transition probability, as in general ant algorithms. The $m$-th ant moves from agent $g_i$ to agent $g_j$ at time $t$ with the probability given by Equation (1) [21], [23]:
$$p_{ij}^{m}(t) = \begin{cases} \dfrac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}(t)]^{\beta}}{\sum_{h \in N} [\tau_{ih}(t)]^{\alpha}\,[\eta_{ih}(t)]^{\beta}}, & j \in N \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$
where $\tau_{ij}(t)$ is the pheromone value on the path $i$-$j$ from agent $g_i$ to agent $g_j$ at time $t$, and $\alpha$ is a parameter that adjusts the effect of the pheromone value ($\alpha \geq 0$). The notation $\eta_{ij}(t)$ is a heuristic function that represents the feasibility of the transition from agent $g_i$ to agent $g_j$ at time $t$ based on known information, and $\beta$ is a parameter that sets the effect of the feasibility value ($\beta \geq 1$). In Equation (1), $h$ is a member of the set $N$, which contains all the agents that belong to the next group of agents, i.e. agents that share one capability that is still required to complete the target. In other words, $N$ is the group of agents for a capability that has not yet been "visited" by the $m$-th ant. At the beginning of the iteration, the initial pheromone value is defined as $\tau_0 = 1/(a \, L_{agen})$, where $a$ is the number of nodes (agents) and $L_{agen}$ is the total distance of each agent to all other agents.
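The transition rule can be sketched in Python as follows. This is a minimal illustration that assumes the usual ant-algorithm form of Equation (1), in which the pheromone value raised to α is multiplied by the heuristic value raised to β and normalized over the candidate set N. The function names, the nested-list storage of the pheromone and heuristic values, and the uniform fallback for all-zero weights are assumptions made for this example.

```python
import random


def transition_probabilities(tau, eta, i, candidates, alpha=1.0, beta=2.0):
    """Probability of moving from agent i to each agent j in `candidates`.

    tau[i][j] and eta[i][j] hold the pheromone and heuristic values.
    `candidates` plays the role of the set N (the next group of agents).
    """
    weights = {j: (tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in candidates}
    total = sum(weights.values())
    if total == 0:
        # Assumed fallback: with no usable information, choose uniformly.
        return {j: 1.0 / len(candidates) for j in candidates}
    return {j: w / total for j, w in weights.items()}


def choose_next_agent(probabilities, rng=random):
    """Roulette-wheel selection of the next agent."""
    agents = list(probabilities)
    return rng.choices(agents, weights=[probabilities[j] for j in agents], k=1)[0]


# Three agents with uniform pheromone; agent 2 is three times more
# attractive than agent 1 when seen from agent 0.
tau = [[1.0] * 3 for _ in range(3)]
eta = [[0.0, 1.0, 3.0], [1.0, 0.0, 1.0], [3.0, 1.0, 0.0]]
probs = transition_probabilities(tau, eta, i=0, candidates=[1, 2])
next_agent = choose_next_agent(probs)
```

With β = 2, the threefold difference in the heuristic value becomes a ninefold difference in selection weight, which shows how β controls the influence of the heuristic information.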
The CPACO algorithm modifies the heuristic function $\eta_{ij}(t)$ of the basic ACO algorithm by using the weighted capabilities of each agent in the numerator and the communication costs in the denominator, as given in Equation (2) [21]. In the CPACO-S algorithm, the heuristic function in Equation (2) is further developed by adding the travel cost from the agents to the target, as given in Equation (3) [23]. Here, the travel cost is proportional to the distances from agent $g_i$ and agent $g_j$ to the target $w$ ($d_{iw}$, $d_{jw}$) with a factor of $\omega^{2}$.
In both CPACO and CPACO-S, an iteration is completed when all ants have reached the last group of agents, that is, the last capability required by the target. Then, the best agent coalition is determined by evaluating the efficiency values of the candidate solutions formed by each ant. The efficiency value for the CPACO algorithm is given in Equation (4) [21]. The numerator in Equation (4) is the sum of the weighted capabilities of all agents selected as the candidate solution by the $m$-th ant. The denominator in Equation (4) is the sum of the communication costs between agents in the same candidate solution. Note that the target requires $R$ capabilities to be completed, so a candidate solution contains at most $R$ agents (number of agents in the candidate solution set $\leq R$).
In the CPACO-S algorithm, the efficiency value of the CPACO algorithm is modified by adding the travel cost between the agents in the candidate solution and the target, as given in Equation (5) [23]. The numerator in Equation (5) is the sum of the weighted capabilities of all agents in the candidate solution of the $m$-th ant. The denominator in Equation (5) is the sum of the total communication costs between agents and the total travel costs between the agents in the candidate solution and the target.
In both CPACO and CPACO-S, the efficiency values of all candidate solutions in the colony are calculated at the end of an iteration. The highest efficiency value in each iteration corresponds to the best agent coalition in that particular iteration. This efficiency value is then used as the basis for updating the best efficiency value and the best agent coalition from all iterations.
As in other general ant algorithms, in addition to updating the best efficiency value and the best agent coalition at the end of each iteration, the CPACO and CPACO-S algorithms also update the pheromone value to increase the efficiency of the solution search. Both algorithms use the same pheromone update as other general ant algorithms, written in Equation (6) [21], [23]. In Equation (6), the pheromone value of the path $i$-$j$ consists of the pheromone evaporation value, which is influenced by the degree of evaporation ($\rho$), and the accumulated pheromone addition ($\Delta\tau_{ij}(t)$) [26]. To calculate $\Delta\tau_{ij}(t)$, the deposit of each ant, $\Delta\tau_{ij}^{m}(t)$, is used, as written in Equation (7) [21], [23]. In Equation (7), $Q$ is a constant that determines the strength of the pheromone, and the deposit also depends on the efficiency value of the $m$-th ant's route.
After updating the best efficiency value, the best agent coalition, and the pheromone value, the algorithm proceeds to the next iteration. The iterations are repeated until the termination criteria of the algorithm have been met. When the termination criteria are met, the candidate solution of the ant with the best value is selected as the best agent coalition to complete the target. Note that both the CPACO and the CPACO-S algorithms consider only a single target.
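The pheromone update described above can be sketched as follows. Because the exact forms of Equation (6) and Equation (7) are not reproduced here, the sketch assumes the common ant-system form, in which the old trail is scaled by (1 − ρ) and each ant adds a deposit of Q times its route efficiency on every edge of its route. Both the form and the name `update_pheromone` are assumptions for illustration, not the definitions from [21] or [23].

```python
def update_pheromone(tau, ant_routes, efficiencies, rho=0.1, q=1.0):
    """Return a new pheromone matrix after evaporation and deposit.

    tau          : square matrix (list of lists) of pheromone values
    ant_routes   : one route per ant, each a list of agent indices
    efficiencies : efficiency value of each ant's route
    Assumed update: tau_new = (1 - rho) * tau + sum over ants of q * efficiency.
    """
    n = len(tau)
    deposit = [[0.0] * n for _ in range(n)]
    for route, efficiency in zip(ant_routes, efficiencies):
        # Consecutive agents in the route form the edges the ant walked.
        for i, j in zip(route, route[1:]):
            deposit[i][j] += q * efficiency
    return [
        [(1.0 - rho) * tau[i][j] + deposit[i][j] for j in range(n)]
        for i in range(n)
    ]


tau = [[1.0] * 3 for _ in range(3)]
new_tau = update_pheromone(tau, ant_routes=[[0, 2, 1]], efficiencies=[0.5],
                           rho=0.1, q=2.0)
```

Edges that were not walked only evaporate, while the edges 0→2 and 2→1 both evaporate and receive the deposit, so routes with higher efficiency become more attractive in later iterations.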

Developing ACO for Task Allocation Problems in Multi-Agent Systems for Multi-Target
This study was carried out in four main stages: problem identification, ACO algorithm modification, simulation, and analysis and evaluation. The flowchart of the four stages is shown in Figure 3. The following assumptions were used in this study. The problem that we consider involves several heterogeneous agents that work together to complete tasks which require certain capabilities. The coordination between agents is assumed to be centralized, i.e. the allocation of agents to tasks is determined by a server or a central control system. It is assumed that the agents communicate with each other during the task completion process, and therefore we consider the communication costs. The communication cost factor was also used to determine the best agent coalition in the CPACO and CPACO-S algorithms.
In addition to the communication cost between agents, we also consider the distance between the agent and the target. However, in contrast to CPACO-S, which uses the weight of the communication factor $\omega^{2}$ to determine the travel cost, we introduce an additional variable $\omega^{3}$, the weight of the agents' transition factor. This variable is used to calculate the travel cost because the agents' movement towards the target during the task completion process may be influenced by factors other than the distance between the agent and the target. Note that the weight of the agent transition factor is …
The variables used in this study for the task allocation problem are as follows:
• A set of agents in a multi-agent system: There are $V$ agents in the multi-agent system, denoted as $G = \{g_1, g_2, g_3, \ldots, g_V\}$.
• Targets: There are $Z$ targets in the multi-agent system environment. The set of targets is written as $W = \{w_1, w_2, \ldots, w_Z\}$.
• Capabilities to complete tasks: Each target $w_z$ requires $R_z$ capabilities ($R_1, R_2, \ldots, R_Z$). The set of capabilities required for target $w_z$ is denoted as $K_{w_z}$ and contains the capabilities $k_r$, where $r$ is the capability index.
• Group of agents: A group of agents with the capability $k_r$ required for target $w_z$ is denoted by $G_{w_z}^{k_r}$.
• Agent coalition: A coalition of agents for target $w_z$ is denoted as $G_{w_z}$.
• A set of agent coalitions for all targets is denoted as $G_w = \{G_{w_1}, G_{w_2}, \ldots, G_{w_Z}\}$.
• The collection of the best agent coalitions, which is the solution to the task allocation problem in a multi-agent system with multiple targets, is denoted as $G_w^{best}$.
A problem that arises when there is more than one target in a task allocation problem is that of overlapping agents, i.e. the same agent(s) being selected for the agent coalitions of different targets. For example, assume that there are two targets $w_1$ and $w_2$ that require the capabilities $K_{w_1} = \{k_1, k_3, k_7\}$ and $K_{w_2} = \{k_1, k_2, k_7\}$ to be completed. The agents in the multi-agent system are then grouped according to the capabilities required by each target. For target $w_1$, the agent groups formed are, for example, $G_{w_1}^{k_1} = \{g_2, g_5, g_6, g_8\}$, $G_{w_1}^{k_3} = \{g_1, g_2, g_4, g_7, g_8\}$, and $G_{w_1}^{k_7} = \{g_5, g_7, g_8, g_9, g_{10}\}$. For target $w_2$, the agent groups formed are $G_{w_2}^{k_1} = \{g_2, g_5, g_6, g_8\}$, $G_{w_2}^{k_2} = \{g_3, g_5, g_6, g_8, g_{10}\}$, and $G_{w_2}^{k_7} = \{g_5, g_7, g_8, g_9, g_{10}\}$. If only the weights of the agents' capabilities and the task completion cost are considered when selecting from each group, as in the CPACO-S algorithm, then the best solutions obtained are $G_{w_1} = \{g_2, g_2, g_5\}$ and $G_{w_2} = \{g_2, g_3, g_5\}$. In these agent coalitions, agents $g_2$ and $g_5$ are selected for both targets. This condition is called overlapping agents, and it has to be minimized so that the task completion process on all targets can be carried out in the shortest possible time. To search for a suitable solution, an objective function that minimizes overlapping agents is added.
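The overlap in this example can be checked with a short Python sketch. The helper name `overlapping_agents` and the dictionary representation of the coalitions are hypothetical choices made for this illustration.

```python
from collections import defaultdict


def overlapping_agents(coalitions):
    """Return {agent: sorted list of targets} for every agent that is
    selected in the coalitions of more than one target."""
    targets_of = defaultdict(set)
    for target, agents in coalitions.items():
        for agent in agents:
            targets_of[agent].add(target)
    return {a: sorted(t) for a, t in targets_of.items() if len(t) > 1}


# Coalitions from the two-target example in the text.
coalitions = {
    "w1": ["g2", "g2", "g5"],
    "w2": ["g2", "g3", "g5"],
}
overlaps = overlapping_agents(coalitions)
print(overlaps)
```

Agent g2 appears twice within the coalition of w1, but this does not count as overlap by itself, because overlap is defined across targets rather than within one coalition.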
In this study, the objective functions are designed to: (1) minimize the task completion costs, (2) maximize the system capabilities, and (3) minimize overlapping agents. The task allocation problem that we consider can, thus, be considered as a multi-objective optimization problem. All objective functions must be optimized simultaneously to obtain the best solution. The final solution is represented in the form of agents' coalition that optimizes all objective functions.

ACO Algorithm Modification for Task Allocation Problem in Multi-Agent Systems with Multi-Targets
The basic ACO algorithm is modified in this study to solve the task allocation problem in a multi-agent system with multiple targets. The proposed model is referred to as the Modified ACO Model. Generally, five elements are specified to define a suitable ant algorithm for a given optimization problem [28].
The first element is the construction of a candidate solution. As mentioned earlier, the final solution of the task allocation problem that we consider is an agent coalition for each target that optimizes all objective functions. To minimize overlapping agents, the final solution is found by solving one target at a time: once the solution for one target is obtained, we calculate the best solution for the next target. Therefore, the number of ant colonies used to construct candidate solutions equals the number of targets. Each ant colony seeks a solution for one target by forming candidate solutions for that target.
The second element is the heuristic function. In this study, more than one heuristic function is used to determine the visibility value of the ant transition, and each heuristic function is derived from a different objective function. The first heuristic function follows from the objective of minimizing the task completion cost, which consists of the communication and travel costs used in the CPACO-S algorithm. The heuristic function of the CPACO-S algorithm is modified to maximize the capability of the system by increasing the probability of selecting the same agent, as defined in Equation (8). The numerator of Equation (8) … For example, if target $w_1$ requires four capabilities and one of the candidate solutions is $G_{w_1} = \{g_2, g_2, g_2, g_5\}$, then the travel cost for agent $g_2$ is counted only once.
Note that for the same agent, the value of the communication cost is considered zero because the distance between the agent and itself is zero. For different agents, the heuristic function used is the same as the heuristic function in the CPACO-S algorithm, with modifications to the variables used to calculate the travel cost.
In CPACO-S, the variable used for the travel cost is the same as the weight of the communication factor ($\omega^{2}$). In the Modified ACO Model, a new variable is used for the travel cost, namely the weight of the agent transition factor ($\omega^{3}$).
In this study, a second heuristic function is added to minimize overlapping agents, as defined in Equation (9), where the variables … in Equation (9) … In addition, one variable in Equation (9) accounts for the selection of agent $g_j$ in other targets. Its value increases when agent $g_j$ is chosen more often in other targets $w_y$ with $y = \{1, 2, \ldots, Z\}$ and $y \neq z$. The visibility value of $g_j$ therefore decreases, which allows the selection of other agents whose visibility value is relatively small but that have never or rarely been chosen in the agent coalitions for other targets.
The total ant route transition visibility from agent $g_i$ to agent $g_j$ for target $w_z$ at time $t$, $\eta_{ij}^{z}(t)$, is calculated using Equation (10).
The third element is the efficiency function, which measures how good a solution is. As mentioned earlier, the solution for each target is found one target at a time. The best solution for each target is calculated using an efficiency function that we refer to as the local efficiency function.
The local efficiency function is determined based on the objective function to be optimized. As there is more than one objective function that we consider, we use more than one local efficiency function to determine the best agent coalition for a target. The first local efficiency function is determined by the objective function to minimize the task completion cost as employed in the CPACO-S algorithm.
To maximize the overall capabilities of the system, the total task completion cost is calculated by considering the travel and communication costs. If an agent is selected to complete multiple capabilities, that agent is only listed once in the set of the candidate solution. The first local efficiency function is defined as follows: The numerator of Equation (11) is the total number of agent capabilities of the candidate solution chosen by the m-th ant for the wz target.
The numerator is calculated over all capabilities $k_r$ that are members of the capability set required to complete the task on target $w_z$ ($\forall k_r \in K_{w_z}$).
The denominator of Equation (11) is the total task completion cost for the target wz of the candidate solution formed by the m-th ant by considering different agents.
The communication cost ($\omega^{2} d_{ij}$) is calculated based on the total distance between agents $g_i \in G_{w_z}^{k_r}$ for $\forall k_r \in K_{w_z}$. The travel cost ($\omega^{3} d_{iw}$) is calculated based on the distance from all agents $g_i \in G_{w_z}^{k_r}$ for $\forall k_r \in K_{w_z}$ to target $w_z$. An additional condition applies: if one agent is selected for multiple capabilities, the travel cost for that agent is counted only once.
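The cost accounting in the first local efficiency function can be illustrated with a Python sketch. It is written under several assumptions that are not taken from Equation (11) itself: distances are Euclidean, the communication cost is summed over consecutive agents in selection order, and the numerator is the plain sum of the capability weights supplied by the caller. The names `local_efficiency`, `w2`, and `w3` (standing in for the weights of the communication and transition factors) are hypothetical.

```python
import math


def distance(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def local_efficiency(selection, weights, positions, target_pos, w2=1.0, w3=1.0):
    """Capability total divided by task completion cost for one target.

    selection : agent chosen for each required capability, in order
    weights   : capability weight contributed at each step
    positions : {agent: (x, y)}
    """
    capability_total = sum(weights)
    # Communication cost between consecutive agents; it is zero when the
    # same agent is selected twice in a row (distance to itself is zero).
    communication = sum(
        w2 * distance(positions[a], positions[b])
        for a, b in zip(selection, selection[1:])
    )
    # Travel cost is counted once per distinct agent.
    travel = sum(w3 * distance(positions[a], target_pos) for a in set(selection))
    cost = communication + travel
    return capability_total / cost if cost > 0 else float("inf")


positions = {"g2": (0.0, 0.0), "g5": (3.0, 4.0)}
efficiency = local_efficiency(
    selection=["g2", "g2", "g2", "g5"],
    weights=[1, 1, 1, 1],
    positions=positions,
    target_pos=(0.0, 4.0),
)
```

In this example, agent g2 covers three capabilities but contributes its travel cost only once, so the total cost is 5 (communication) plus 4 + 3 (travel), which rewards coalitions that reuse a capable agent within one target.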
The next objective function is designed to minimize overlapping agents. It is defined as the second local efficiency function in Equation (12), whose value is influenced by a variable defined in Equation (9). When the candidate solution of the m-th ant consists of agents that are also selected for several other targets, the second local efficiency value becomes smaller.
Based on the two efficiency functions, the total local efficiency value of the m-th ant for target $w_z$ is calculated as the product of the first and second local efficiency values. After the candidate solutions for target $w_z$ have been formed by all ants in a colony, the best candidate is the one with the largest total local efficiency value, which is the best local efficiency value for target $w_z$.
In addition to the local efficiency values, a global efficiency value is used to determine the best overall solution. The best local efficiency values of all targets are added to produce the global efficiency value of each iteration. This value is compared at the end of each iteration to obtain the best overall efficiency value.
The fourth element is the probability of ant transition, which is influenced by the ant transition visibility value (heuristic function) and the pheromone value. In this study, the probability that the m-th ant moves from agent $g_i$ to agent $g_j$ while finding the solution for target $w_z$ at time $t$, $p_{ij}^{z}(t)$, is calculated using Equation (1) with the visibility value $\eta_{ij}^{z}(t)$ and the pheromone value $\tau_{ij}^{z}(t)$.
The last element is the pheromone update rule. Each ant colony uses a different pheromone matrix to store the accumulated pheromone values, in order to avoid selecting the same agent for different targets. The equation used to update the pheromone value on the path from agent $g_i$ to agent $g_j$ for target $w_z$, $\tau_{ij}^{z}(t+1)$, is the same as in the CPACO and CPACO-S algorithms, Equation (6), applied to the pheromone value on the path used to find the solution for target $w_z$, $\tau_{ij}^{z}(t)$. In this study, the pheromone update rule is applied only to the ant paths formed by the best agent coalition for target $w_z$.
If the ant in the best agent coalition for target wz moves from agent gi to agent gj, the amount of pheromone deposited at time t is calculated as ∆ ( ) = ; otherwise, the value is zero.
From the description of each element of the ACO algorithm developed in this study, it can be concluded that the modifications are mainly made to the heuristic and efficiency functions. The pseudocode of the proposed Modified ACO is shown in Figure 4.
The final solution generated by the Modified ACO Model is the set of agent coalitions for all targets that produces the best overall efficiency value. The final solution is searched one target at a time. When an agent coalition is defined for a target, the result affects the selection of the agent coalition for the next target. Thus, the order in which the targets are solved may affect the agent coalitions selected for all targets. Testing is needed to determine the effect of the target sequence, so that the best Modified ACO Model is obtained for solving the task allocation problem of multi-agent systems with multiple targets.

Simulation
Simulations were carried out using Matlab R2015a to test the performance of the Modified ACO Model in solving the multi-agent, multi-target task allocation problem. In the simulation, the problem was generated in a two-dimensional area with 0 ≤ x ≤ 50 and 0 ≤ y ≤ 50. The multi-agent system consists of ten heterogeneous agents with ten types of capabilities. To simplify the simulation, an agent's capability is represented as a binary number, i.e. 1 indicates that the agent has a certain capability whereas 0 indicates that it does not. With this binary weighting, the heuristic function $\eta_{ij1}^{z}(t)$ in Equation (8) and the first local efficiency function in Equation (11) can be simplified into Equation (13) and Equation (14), respectively.
Here, the value of each agent's capability ($g_i k_a$, $g_i k_b$, $g_i k_r$ for $r = \{1, 2, \ldots, R\}$) in Equation (8) and Equation (9) is equal to one. Therefore, in Equation (13) and Equation (14), we need only consider the importance of the agents' capabilities in completing the target, i.e. $\omega_a^{1}$, $\omega_b^{1}$, $\omega_r^{1}$ for $r = \{1, 2, \ldots, R\}$.
In the simulation, information on the agents' capabilities is written in an MGK capability matrix, as shown in Table 1. Meanwhile, the position of each agent is shown in Table 2.
Two simulations were carried out:
1. A simulation to analyze the process of finding the optimum solution using the proposed Modified ACO Model, and
2. A simulation to evaluate the proposed Modified ACO Model in comparison to the benchmark algorithms, i.e. CPACO and CPACO-S.
In the proposed Modified ACO Model, the optimum solution for each target is calculated iteratively, starting from the first target, the second target, and so on. Therefore, the determination of target sequence may affect the selection of agents' coalitions. To analyze this issue, in the first simulation, the target sequence was determined using two approaches: (1) based on initial information and (2) randomly. Four tests were carried out for each approach with different number of targets, i.e. 5, 10, 15 and 20. Each test was simulated 30 times with random target combination.
In the second simulation, evaluation to the proposed Modified ACO Model was done by comparing this algorithm to a benchmark algorithm, namely the CPACO-S [23], which is proven to be more efficient than the original ACO algorithm and the CPACO algorithm. For each algorithm, six tests were conducted by varying the number of targets, i.e. 3, 5, 8, 10, 15, and 20. Each test was simulated 30 times with random target combination.
In the first and second simulations, the targets were taken randomly from the following data:
1. The data contains 30 targets with different positions and capability requirements.
2. Target positions were within the simulation area, i.e. 0 ≤ x ≤ 50 and 0 ≤ y ≤ 50.
3. None of the targets was in the exact same location as other targets or agents in the simulation area.
4. Each target requires more than one capability to be completed, and there might be more than one target with the same capability requirement.

Analysis and Evaluation
Two performance metrics were used to analyse the simulation results:
1. The total task completion cost, which is the sum of the communication costs and travel costs of the agent coalitions of all targets. Note that the agent coalition of a target contains all the agents selected to complete the target, and that if one agent is selected for multiple capabilities, the travel cost for that agent is counted only once. The total task completion cost ($\Sigma tcc$) is calculated using Equation (15), where $W$ is the set of all targets to be solved and $G_{w_z}$ is the best agent coalition for target $w_z$. In Equation (15), $\omega^{2} d_{ij}$ represents the communication cost between two consecutive agents $g_i$ and $g_j$, which is proportional to the distance between agent $g_i$ and agent $g_j$ ($d_{ij}$) with a communication weight factor ($\omega^{2}$). Meanwhile, $\omega^{3} d_{iw_z}$ represents the travel cost, which is proportional to the distance between agent $g_i$ and target $w_z$ with a factor of $\omega^{3}$.
2. The overlapping value ($\vartheta$), calculated using Equation (16). It is based on the maximum number of targets that select the same agent in the system. For example, if agent $g_i$ is selected by three targets, agent $g_j$ is selected by five targets, and the other agents are selected by one target only, then this maximum is 5. The maximum is then multiplied by the ratio between the number of targets with overlapping agents and the total number of targets.
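The description of the overlapping value can be turned into a small Python sketch. It assumes that a target "has overlapping agents" when its coalition contains at least one agent that is also selected for another target, and that an agent repeated within a single coalition is counted once for that target. The function name `overlapping_value` is hypothetical.

```python
from collections import Counter


def overlapping_value(coalitions):
    """Maximum number of targets sharing one agent, multiplied by the
    fraction of targets whose coalition contains a shared agent."""
    counts = Counter()
    for agents in coalitions.values():
        counts.update(set(agents))  # count each agent once per target
    if not counts:
        return 0.0
    max_targets = max(counts.values())
    shared = {agent for agent, c in counts.items() if c > 1}
    targets_with_overlap = sum(
        1 for agents in coalitions.values() if shared & set(agents)
    )
    return max_targets * targets_with_overlap / len(coalitions)


coalitions = {
    "w1": ["g2", "g2", "g5"],
    "w2": ["g2", "g3", "g5"],
    "w3": ["g1"],
}
value = overlapping_value(coalitions)
```

Here agents g2 and g5 are each selected by two targets, and two of the three targets contain a shared agent, so the value is 2 × 2/3. A set of coalitions with no shared agent gives a value of zero under this reading.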
The simulation results were then evaluated using the two performance metrics described in Equation (15) and Equation (16).

Simulation 1
In the first simulation, we determined the target sequence for solution finding using two approaches. The first approach is based on the initial target information without any change in the target sequence. In other words, the first target solved is the first target listed in the initial data. The second approach determines the target sequence randomly and changes it in each iteration. Each approach was tested with four different numbers of targets, i.e. 5, 10, 15 and 20. Each test was simulated 30 times with different combinations of targets taken from the data of 30 targets. The results are shown in Figure 5 and Figure 6, and the summary of the results is shown in Table 3.
At a 95% confidence interval, Figure 5 and Table 3 show that the total average task completion cost for random target sequencing is significantly lower than that for initial target sequencing. The total average task completion costs obtained with random target sequencing are 11.08%, 17.41%, 14.28% and 20.84% lower than those obtained with the initial target sequence for 5, 10, 15 and 20 targets, respectively. Figure 6 and Table 3 show that random target sequencing returns slightly lower overlapping values than initial target sequencing. In conclusion, random target sequence determination is superior to target sequence determination based on initial information. These results also suggest that random target sequence determination allows a better solution search, so that a better agent allocation can be found.

Simulation 2 Scenario
In the second simulation, the proposed Modified ACO Model is compared with the CPACO-S algorithm, which has been shown to be superior to the basic ACO algorithm and the CPACO algorithm in terms of the efficiency of the resulting agent coalition [23]. Six tests were conducted for each algorithm using different numbers of targets, i.e. 3, 5, 8, 10, 15, and 20. The results are shown in Figure 7 and Figure 8. The summary of the test results is shown in Table 4.
Figure 7 and Figure 8 show that the Modified ACO Model in this study performs better than the CPACO-S algorithm in terms of the total average task completion cost and overlapping values. At a 95% confidence interval, Table 4 shows that the Modified ACO Model produces a significantly lower total average task completion cost in several tests. The superiority of the Modified ACO Model is significant when the number of targets is less than the number of agents. The Modified ACO Model produces a lower task completion cost than CPACO-S by 24.95%, 15.30%, 11.50%, 10.17%, 4.62% and 4.67% for 3, 5, 8, 10, 15 and 20 targets, respectively. When the number of targets increases, the total average task completion cost of the Modified ACO Model is not significantly better than that of the CPACO-S algorithm. This may be due to the limited number of agents available, so that the agents allocated to a specific target may not be the agents that produce the best task allocation cost. However, since our modifications maximize the agents' capability and minimize agents' overlap, the efficiency of the agent coalition generated by the proposed Modified ACO Model becomes superior to that of the CPACO-S algorithm.
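As a quick arithmetic check, the six per-test reductions listed above can be averaged in a few lines of Python. The helper `percent_reduction` is a hypothetical illustration of how such a percentage would typically be obtained from two mean costs, assuming the reduction is expressed relative to the benchmark; the raw costs from Table 4 are not reproduced here.

```python
def percent_reduction(benchmark_cost, proposed_cost):
    """Reduction of the proposed cost relative to the benchmark, in percent."""
    return 100.0 * (benchmark_cost - proposed_cost) / benchmark_cost


# Reductions reported in the text, keyed by number of targets.
reductions = {3: 24.95, 5: 15.30, 8: 11.50, 10: 10.17, 15: 4.62, 20: 4.67}

mean_reduction = sum(reductions.values()) / len(reductions)
print(f"{mean_reduction:.2f}%")  # prints 11.87%
```

The mean of the six values is 11.87%, which agrees with the overall figure quoted in the Conclusion.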
In this study, the ACO algorithm has been modified to minimize the number of overlapping agents, i.e. agents selected by different targets. In addition, modifications were made to prioritize the selection of the same agent within an agent coalition, in order to minimize the number of agents allocated to a target. Figure 8 shows that the two modifications in the proposed Modified ACO Model have succeeded in minimizing the overlapping agents. The Modified ACO Model produced a significantly lower average overlapping value than the CPACO-S algorithm at a 95% confidence interval, as also shown in Table 4. This indicates that the Modified ACO Model is more efficient in allocating agents to targets than the CPACO-S algorithm. Overall, it can be concluded that the Modified ACO Model proposed in this study performs better than the CPACO-S algorithm in solving the task allocation problem of a multi-agent system with multi-target.

Conclusion
This study proposes a Modified ACO Model to solve the task allocation problem in a multi-agent system with multi-target. Modifications are made to minimize the number of overlapping agents, i.e. cases where the same agent is selected to solve several different targets. The simulation results show that the proposed Modified ACO Model outperforms the benchmark algorithm (CPACO-S) in minimizing task completion cost, with a cost that is approximately 11.87% lower. Furthermore, the simulation results show that the Modified ACO Model performs well in minimizing overlapping agents, with an overlapping value that is approximately 55.11% lower than that of the benchmark algorithm.
Further research can be conducted to evaluate the performance of the proposed Modified ACO Model in comparison with other well-known optimization methods for solving task allocation problems in multi-agent systems with multi-target. Benchmark optimization methods may include: (1) the Auction-based method, which is inspired by the economic system, and (2) the Genetic Algorithm (GA), which is a bio-inspired algorithm.