Summary of Plan merging via plan reuse

Nerea Luis Mingueza

  • In a world where people are more connected and move faster each day, we are constantly surrounded by multi-agent scenarios in our daily lives: robots that solve tasks on demand inside a building, luggage routing inside an airport, food/package delivery, warehouse organization, cars in a parking lot, drone swarms, etc. In this Thesis we present a solid multi-agent approach that can be applied to a broad set of environments.

    Since Artificial Intelligence was born in the early 50s, one of the challenges researchers have been facing is how to replicate the act of human reasoning. Understanding how our brain works is still neither easy nor obvious. We humans make decisions daily in order to perform "actions" that will affect, usually in a positive way, ourselves and/or the environment that surrounds us. Some decisions will require more effort and coordination to be successfully performed than others, especially if they involve several people. For instance, when ordering a pizza using a smartphone you will usually do the following:

    1. Click on the app of the pizza restaurant of your choice and log in.

    2. Customize your pizza.

    3. Choose your payment method.

    4. Confirm and wait for your pizza at home.

    Although the sequence of steps might be tricky if you are not familiar with ordering online, the reasoning process is trivial. We call deliberative reasoning the process of inference that selects which set of ordered actions will be performed in order to reach some objective. Deliberative reasoning has multiple levels of abstraction, e.g. customizing your pizza involves looking for available ingredients, choosing your favorite ones, choosing the kind of dough, etc. The deeper the abstraction, the less deliberative the behavior becomes. Low-level actions, such as picking up an ingredient, belong to reactive reasoning, which means that they barely need any inference or thinking component to be performed.

    Automated planning is the field of Artificial Intelligence that studies the processes of deliberative reasoning. Planning algorithms are able to automatically generate, synthesize and organize a set of actions (a plan) that, if applied in order, makes the system transition from an initial state to a state where the goals (objectives) are achieved. Regarding the previous example, as the plan is performed by a single human (agent), it belongs to the subfield of single-agent planning.
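
    To make the notions of state, action and plan concrete, the following minimal sketch models a classical planning task in a STRIPS-like style. All names and actions are illustrative and hypothetical, not taken from the thesis.

        # A state is a set of facts; an action has preconditions and add/delete effects.
        ACTIONS = {
            "open_app":  {"pre": set(),           "add": {"app_open"},     "del": set()},
            "customize": {"pre": {"app_open"},    "add": {"pizza_ready"},  "del": set()},
            "pay":       {"pre": {"pizza_ready"}, "add": {"paid"},         "del": set()},
            "confirm":   {"pre": {"paid"},        "add": {"order_placed"}, "del": set()},
        }

        def apply_plan(initial_state, plan):
            # Apply an ordered sequence of actions; fail if a precondition is unmet.
            state = set(initial_state)
            for name in plan:
                action = ACTIONS[name]
                if not action["pre"] <= state:
                    raise ValueError(f"precondition of {name} not satisfied")
                state = (state - action["del"]) | action["add"]
            return state

        goal = {"order_placed"}
        plan = ["open_app", "customize", "pay", "confirm"]
        assert goal <= apply_plan(set(), plan)  # the plan reaches a goal state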

    Meanwhile, in the pizza restaurant, the situation is more complicated to handle. From the point of view of deliberative reasoning, a team of humans has to coordinate several times to:

    1. Complete each received order.

    2. Deliver the orders to the correct destination address.

    3. Replenish ingredients/doughs/resources when needed.

    One of the first ways to tackle this problem would be to have a team in charge of doughs, another team for customizing each pizza and one last team for delivery. Each of the teams would have its own plan to carry out its mission. There is still one last issue: communication. Teams could decide to either communicate everything they do or only communicate when something unexpected happens (usually conflicts). In Automated Planning, when multiple agents are involved, we talk about Multi-Agent Planning (MAP).

    This subfield studies not only the reasoning process but also the coordination/cooperation among agents, the communication (especially if some information is private) and the resolution of conflicts. In a world where digital components and autonomous/intelligent systems are scaling quickly, multi-agent scenarios arise more and more often. Here we present some real-life examples where MAP could be applied:

    • Logistics problems such as warehouse management, where robots help workers to complete the customers' orders, e.g. Amazon's Kiva robots or Alibaba's Quicktrons.

    • Transportation problems such as delivery or school bus routes.

    • Surveillance tasks in buildings with multiple robots or in open environments using Unmanned Air Vehicles.

    • Service robots to assist people in public/private buildings e.g. Cobots or STRANDS.

    • Bots behavior in games, simulations or real-life games.

    Therefore, we can say that Multi-Agent Planning (MAP) aims at solving planning tasks for/by a set of agents. Dealing with a set of agents instead of planning for a single agent involves some issues that directly affect the design of the algorithm. A common classification groups these issues into six features: agent distribution (who are the agents or executors?), computational process (centralized or distributed), agents' coordination (before, during or after planning), communication (internal, or no communication at all), heuristic search (per agent or global) and privacy preservation (private information, ciphered communication, obfuscation).

    One of the problems that MAP faces is scalability. Having a considerable number of agents and goals produces bottlenecks in task assignment, planning and communication. Depending on the level of interaction among agents, communication could be avoided as long as there exists some algorithm to deal with interactions/coordination.

    In big-size scenarios, exploring the search space implies spending a considerable amount of planning time. Therefore, skipping communication might alleviate the cost of solving the planning task. Also, dividing the goals among the agents in an efficient way, e.g. minimizing potential interactions, will make the planning tasks easier to solve. Thus, if after goal assignment agents can plan individually, the set of plans will be more likely to be parallelizable. The Automated Planning metric that measures the number of execution steps of a plan is called makespan. Minimizing the makespan in this kind of environment results in efficient parallel plans that, when executed jointly, solve the planning task.
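
    The distinction between plan cost and makespan can be illustrated with a small, hypothetical example (the plans below are illustrative, not taken from the thesis): if the agents' individual plans are independent, they can be executed in parallel, so the makespan of the joint plan is the length of the longest individual plan rather than the sum of all plan lengths.

        individual_plans = {
            "agent1": ["move_a_b", "pick_p1", "move_b_c", "drop_p1"],
            "agent2": ["move_d_c", "pick_p2", "drop_p2"],
        }

        # Total number of actions (sequential cost) vs. parallel execution steps (makespan).
        total_actions = sum(len(p) for p in individual_plans.values())  # 7 actions
        makespan = max(len(p) for p in individual_plans.values())       # 4 time steps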

    The main idea of this Thesis consists of extending some previous works on multi-agent and single-agent planning. Specifically, we want to focus on solving the MAP scalability problem mentioned above. We will focus on big-size multi-agent tasks in terms of number of agents, goals and search space, where agents have little or no interaction among them and the objective is to improve the makespan. We aim to skip agents' communication by solving the interactions that might arise using plan merging and plan reuse after the planning phase. Also, in order to obtain efficient parallel plans, our proposal will focus on minimizing the makespan metric.

    Specifically, we contribute with (1) a MAP approach that is able to automatically adapt itself to different multi-agent scenarios using plan merging and plan reuse; (2) a new single-agent planning approach that combines heuristic search and plan reuse, which can be integrated into the former; and (3) an adaptation of our approach to robotics environments based on path-planning tasks.

    In order to achieve that, the particular steps which we aim to accomplish are:

    • Design a new MAP approach that automatically adjusts to the interaction level among agents and goals. We will focus on combining distributed and centralized MAP techniques to solve agents' interactions.

    • Explore plan-reuse algorithms to fix agents' interactions. We will focus on post-planning coordination of agents to avoid agents' communication during planning. Thus, a new plan-reuse approach will be proposed.

    • Analyze efficiency, scalability, the modelling of tasks and the environment, and agents' features to identify the strengths and weaknesses of our proposal. We will especially focus on the size of the planning task, measured as the number of agents and goals.

    • Study a real-world use case to apply our approach. In order to explore and contribute to other areas of knowledge, we will study the robotics surveillance problem.

    We have briefly summarized the motivation behind this Thesis, the main objectives and some observations about the research background. Now we describe the main contributions of this Thesis:

    The first contribution of this Thesis is the multi-agent planner Plan Merging by Reuse (PMR), an algorithm capable of solving a MAP task by merging individual plans and applying plan-reuse techniques.

    PMR focuses on solving classical deterministic planning tasks, where a set of agents should find a common plan. The objectives are to:

    • Efficiently solve MAP tasks by combining distributed and centralized MAP techniques.

    • Focus on big-size multi-agent tasks in terms of number of agents, goals and search space, where agents have little or no interaction.

    • Directly reuse existing planning techniques without further code modification.

    • Effectively apply factorization to divide the main task into subtasks according to some minimization criterion such as plan length or makespan.

    • Automatically adjust to the interaction level among agents and goals.

    In order to satisfy those objectives, we run up to three different planning processes inside our first contribution. In summary, our approach works as follows. First, a preprocessing step is performed to divide the MAP task among the agents. Then, PMR performs an individual planning phase run by each agent (distributed part), a plan-merging phase to generate a joint plan, and the validation of that plan. When validation fails, PMR runs a plan-reuse episode. Only when the factorization or the individual planning phase fails does PMR run a centralized planning phase. Regardless of the chosen path, PMR parallelizes the resulting plan in the last step and the output of this process is returned as the solution. For the sake of clarity, we are not interested in privacy preservation nor in joint tasks.
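
    A schematic sketch of this control flow is shown below. The function names (factorize, individual_planning, merge, validate, plan_reuse, centralized_planning, parallelize) are placeholders standing for the phases described above, not the actual PMR implementation.

        def pmr(task, agents):
            subtasks = factorize(task, agents)                # preprocessing: goal assignment
            if subtasks is None:                              # factorization failed
                joint_plan = centralized_planning(task)       # centralized phase (C)
            else:
                plans = [individual_planning(st) for st in subtasks]  # distributed phase
                if any(p is None for p in plans):             # some agent found no plan
                    joint_plan = centralized_planning(task)   # centralized phase (C)
                else:
                    joint_plan = merge(plans)                 # merging phase (M)
                    if not validate(task, joint_plan):        # conflicts among agents
                        joint_plan = plan_reuse(task, joint_plan)  # plan-reuse phase (R)
            return parallelize(joint_plan)                    # final step: reduce makespan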

    Regarding the preprocessing step, first the agents of the problem are identified in order to divide the planning task. Besides, in Multi-Agent Systems or robotics, some approaches perform task allocation (the assignment of each public goal to a single agent) to improve the efficiency of problem solving. Inspired by that, PMR has a factorization step, which consists of dividing the goals among the agents by following some strategy.
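
    As an illustration (not the thesis implementation), one possible goal-allocation strategy assigns each public goal to the agent with the lowest estimated cost, using the current load to break ties; estimate_cost is a hypothetical placeholder for any cheap estimate, such as a relaxed-plan heuristic.

        def allocate_goals(goals, agents, estimate_cost):
            # One subtask (list of goals) per agent; each goal goes to the cheapest
            # agent, breaking ties by the number of goals already assigned (load balance).
            assignment = {agent: [] for agent in agents}
            for goal in goals:
                best = min(agents, key=lambda a: (estimate_cost(a, goal), len(assignment[a])))
                assignment[best].append(goal)
            return assignment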

    A key feature of PMR is that it automatically adapts to the interaction level among agents and goals, varying its behavior from distributed to centralized. It generates individual plans and merges them in the merging phase (M) with a low computational effort if the domain has a low degree of interaction. Otherwise, it uses plan reuse (R) in domains with more interaction and resorts to centralized planning (C) in case of domains with strong interactions. Moreover, PMR can easily be configured to target coverage, cost or makespan by just changing its goal-allocation strategy. Our aim was to focus on minimizing the makespan in big-size problems that have low interaction. Another advantage of PMR is that it only uses off-the-shelf planners in its three phases (M, C and R). Hence, we can trivially improve the performance of PMR by replacing the planners used with better ones once they are developed.

    The second contribution of this Thesis is the plan-reuse planner Reuse Randomly-exploring Planning Tree (RRPT-plan), an algorithm that combines plan reuse, sampling and local search to solve a planning problem. RRPT-plan receives the domain, the problem and an input plan (usually invalid) from which it will try to reuse actions to include in the final plan. We have shown in the experiments that RRPT-plan adapts to diverse plan-reuse scenarios, including those that are not usually considered by state-of-the-art plan-reuse planners.

    Among the spectrum of scenarios that can arise when trying to repair a plan, the most frequent one is when the invalid input plan is very similar to the final solution. Plan reuse will then be very efficient, as most of the final plan's actions are already in the invalid input plan. Thus, the planner reuses most of the invalid plan's actions and only has to include a small set of new actions to transform it into a valid one. However, sometimes the solution plan has to be completely changed. In order to increase efficiency, combining search and plan reuse helps a plan-reuse planner to better solve a wide variety of scenarios: from those that are truly similar to those that look similar but are not.

    Our contribution is inspired by two previous planners: ERRT-plan and RPT. RRPT-plan emulates the ERRT-plan behavior by receiving as input the invalid plan, the domain and problem descriptions and the set of probabilities. RRPT-plan combines search, plan reuse and sampling using the state-of-the-art planner Fast Downward, which was the base planner used by RPT.

    RRPT-plan has three parameters:

    • ε: limits the number of expanded nodes of the local search.

    • p: probability of executing local search towards the nearest goal.

    • r: probability of executing plan reuse.

    Depending on the initial values of this set of parameters, RRPT-plan performs either search, sampling or plan reuse in each iteration. The result is a valid plan based on the input plan. As RRPT-plan is a plan-reuse planner, it can be included inside PMR.
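
    One possible reading of this parameterization is sketched below. It is an assumption for illustration rather than the planner's actual control loop: it assumes p + r ≤ 1, and local_search, reuse_step and sample_step are hypothetical placeholders for the three kinds of step.

        import random

        def rrpt_iteration(state, input_plan, epsilon, p, r):
            roll = random.random()
            if roll < p:
                # local search towards the nearest goal, expanding at most epsilon nodes
                return local_search(state, max_expanded=epsilon)
            elif roll < p + r:
                # plan reuse: try to extend the search with actions taken from the input plan
                return reuse_step(state, input_plan)
            else:
                # sampling: bias exploration towards a randomly sampled state (RRT-style)
                return sample_step(state)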

    Experiments are run with RRPT-plan both inside and outside of PMR. Thus, the first part of the experiments is dedicated to plan reuse and an analysis of RRPT-plan's parameters. Then, we rerun the Competition of Distributed and Multi-Agent Planners (CoDMAP) to compare against the original winners. After that, we show some results on changing the agentification and increasing the number of agents of a given problem. Finally, we focus on hard problems, where the number of agents, goals and the search space considerably increase.

    The potential of PMR was not apparent in the CoDMAP results nor in the experiments that changed the agentification, so we moved to harder problems in terms of size. There, the results of PMR-RRPT-plan have shown that it easily adapts to any type of MAP problem, independently of the problem's features (e.g. number of agents, goals, interactions). It adapted especially well to those problems that had little interaction and a topology that allowed the goal assignment to indirectly divide the available actuation space among the agents. PMR-RRPT-plan maintains good results on coverage and time and remarkable results on makespan, especially in combination with the Load-balance strategy.

    Lastly, we present joint work with Manuela Veloso's CORAL lab, where our first contribution, PMR, was combined with Actuation Maps to speed up the planning process and work efficiently in big multi-agent environments. The aim is to solve multi-robot path-planning problems. The key was to skip the computation of estimated costs during planning, which was our main bottleneck for scaling. Given a building map, the Actuation Map is the image representation of the reachability limit of a robot's motion and actuation over that map. Thus, Actuation Maps can determine the regions each agent can actuate on. As a result, specific information can be extracted to know which goals can be tackled by each agent, as well as to cheaply estimate the cost of using each agent to achieve every goal.

    We employed Actuation Maps in a preprocessing step to determine the feasibility of robot-goal pairs and to extract an estimated cost. That cost was later used to avoid the computation of relaxed plans during Goal Assignment. The environment map was discretized into a grid of waypoints. The goals were distributed by a goal-allocation algorithm, and unfeasible goals were identified and discarded from the planning task. Then, the planning task was factorized for each robot. Each robot generates its individual path, which results in a maximal space coverage in terms of actuation. We also contributed an adapted version of PMR to fix agents' interactions after the individual planning phase. In the experiments we designed a total of six scenarios, five for circular robots and one for any-shape robots.
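
    For illustration, the sketch below treats an Actuation Map as a boolean grid of cells a robot can actuate on (an assumption made for this example, not the thesis code): a goal is feasible for a robot if its waypoint is reachable in that robot's map, and the grid distance gives a cheap cost estimate that can replace relaxed-plan computations during Goal Assignment.

        from collections import deque

        def estimate_cost(actuation_map, start, goal):
            # Breadth-first search over actuatable grid cells; returns the number of
            # steps from start to goal, or None if the robot-goal pair is infeasible.
            rows, cols = len(actuation_map), len(actuation_map[0])
            frontier, seen = deque([(start, 0)]), {start}
            while frontier:
                (r, c), dist = frontier.popleft()
                if (r, c) == goal:
                    return dist
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and actuation_map[nr][nc] and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        frontier.append(((nr, nc), dist + 1))
            return None  # goal unreachable: discard this robot-goal pair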

    Our approach greatly reduces the Goal Assignment time. We were also able to reduce the overall planning time when preprocessed information was provided to the MAP algorithm. The gains in performance depend greatly on the topology of the environment and the characteristics of each robot.

    Finally, the experiments of the third contribution also proved that, when solving big-size multi-agent problems using planning, it is essential to first factorize the problem into subtasks. This is also helpful when working on problems that explicitly involve agents' interactions. Experiments on collision avoidance showed the importance of task factorization and of the topology of the scenario in order to successfully fix collisions.

