Autonomous mobile manipulation requires the integration of navigation, mapping, perception, and localization into a unified system. In this work, these components are orchestrated using a behavior-based control framework.
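As a rough illustration of what such orchestration can look like, the sketch below arbitrates between a few behaviors by running the first one whose trigger condition holds. All names here (`Behavior`, the behavior list, the state keys) are illustrative assumptions, not part of any specific framework:

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical illustration: each behavior pairs a trigger condition
# with an action; the arbiter runs the first behavior whose condition holds.
@dataclass
class Behavior:
    name: str
    condition: Callable[[dict], bool]   # reads the shared world state
    action: Callable[[dict], None]      # mutates the shared world state

def arbitrate(behaviors: list[Behavior], state: dict) -> None:
    for b in behaviors:
        if b.condition(state):
            print(f"running behavior: {b.name}")
            b.action(state)
            return

state = {"object_seen": True, "at_goal": False}
behaviors = [
    Behavior("grasp", lambda s: s["object_seen"] and s["at_goal"],
             lambda s: s.update(grasped=True)),
    Behavior("navigate", lambda s: not s["at_goal"],
             lambda s: s.update(at_goal=True)),
    Behavior("explore", lambda s: True,
             lambda s: s.update(object_seen=True)),
]
arbitrate(behaviors, state)  # -> running behavior: navigate
```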
## Navigation
Navigation refers to the process of planning and executing collision-free motion from a start to a goal position. It is typically decomposed into:
- Global planning: computing an optimal path on a map.
- Local planning: generating feasible velocity commands that follow the global path while avoiding nearby obstacles.
Modern navigation systems use costmaps together with search-based planners such as A* or Dijkstra's algorithm. In Nav2, navigation is modular, allowing different planners and controllers to be interchanged. Since local planning can largely be left to Nav2, this section only discusses some theory on how coverage path planning is done in this case.
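As a minimal illustration of search-based global planning, the sketch below runs A* on a small occupancy grid. It is a simplification under stated assumptions (4-connected grid, unit step cost, Manhattan heuristic), not the planner Nav2 actually ships:

```python
import heapq

def astar(grid, start, goal):
    """A* on a 2D occupancy grid (0 = free, 1 = occupied), 4-connected.

    Minimal illustration of search-based global planning; Nav2's planners
    add costmap inflation, tie-breaking, and other refinements.
    """
    def h(a, b):  # Manhattan distance, admissible on a 4-connected grid
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    open_set = [(h(start, goal), 0, start, None)]
    came_from, g_cost = {}, {start: 0}
    while open_set:
        _, g, node, parent = heapq.heappop(open_set)
        if node in came_from:          # already expanded with a better cost
            continue
        came_from[node] = parent
        if node == goal:               # reconstruct the path back to start
            path = [node]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = ng
                    heapq.heappush(open_set, (ng + h((nr, nc), goal), ng, (nr, nc), node))
    return None  # no collision-free path exists

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))
# [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]
```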
### Global Planning Methods
Global path planning consists of two major steps: decomposition of the environment and path planning over the decomposed regions.
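For coverage tasks, a common pattern is a boustrophedon (lawnmower) sweep of each decomposed cell. The sketch below generates such a sweep for a single rectangular cell; the bounds and spacing are illustrative, and a full planner would first decompose the free space around obstacles:

```python
def boustrophedon_path(x_min, x_max, y_min, y_max, spacing):
    """Generate back-and-forth sweep waypoints over one rectangular cell.

    Illustrative only: a full coverage planner first decomposes the free
    space into obstacle-free cells and then sweeps each cell like this.
    """
    waypoints = []
    y = y_min
    left_to_right = True
    while y <= y_max:
        row = [(x_min, y), (x_max, y)]
        waypoints.extend(row if left_to_right else row[::-1])
        left_to_right = not left_to_right   # alternate sweep direction
        y += spacing
    return waypoints

print(boustrophedon_path(0.0, 2.0, 0.0, 1.0, 0.5))
# [(0.0, 0.0), (2.0, 0.0), (2.0, 0.5), (0.0, 0.5), (0.0, 1.0), (2.0, 1.0)]
```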
## Mapping
Mapping is the process of constructing a representation of the environment. In unknown environments, this is often addressed using Simultaneous Localization and Mapping (SLAM).
A common representation is the occupancy grid, where each cell encodes the probability of being occupied. Mapping can be expressed probabilistically as the posterior

$$
p(m \mid z_{1:t}, x_{1:t})
$$

where:

- $m$ is the map
- $z_{1:t}$ are the sensor measurements
- $x_{1:t}$ are the robot poses
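Per cell, this posterior is usually maintained in log-odds form, which turns the Bayesian update into a simple addition. A minimal sketch, assuming illustrative inverse sensor model probabilities (0.7 for a hit, 0.3 for a miss):

```python
import math

def logodds(p):
    return math.log(p / (1.0 - p))

# Illustrative inverse sensor model values (assumptions, not from any library):
L_OCC, L_FREE, L_PRIOR = logodds(0.7), logodds(0.3), logodds(0.5)  # L_PRIOR = 0

def update_cell(l, hit):
    """Standard log-odds occupancy update for one grid cell."""
    return l + (L_OCC if hit else L_FREE) - L_PRIOR

def probability(l):
    """Convert log-odds back to an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(l))

l = L_PRIOR                             # start at the 0.5 prior
for hit in [True, True, False, True]:   # four measurements of one cell
    l = update_cell(l, hit)
print(round(probability(l), 3))         # ~0.845: the cell is likely occupied
```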
Different SLAM approaches (e.g., grid-based vs feature-based) trade off accuracy, computational cost, and robustness.
## Localization
Localization estimates the robot’s pose within a known map. A widely used approach is probabilistic localization, such as particle filters.
The recursive Bayesian update is given by:

$$
\mathrm{bel}(x_t) = \eta \, p(z_t \mid x_t) \int p(x_t \mid u_t, x_{t-1}) \, \mathrm{bel}(x_{t-1}) \, dx_{t-1}
$$

where:

- $x_t$ is the robot pose
- $z_t$ is the observation
- $u_t$ is the control input
- $\eta$ is a normalizing constant
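A particle filter approximates this update with a weighted set of samples: predict with the motion model, weight with the measurement likelihood, and resample (which absorbs the normalizer $\eta$). A minimal one-dimensional sketch with illustrative noise parameters, not AMCL itself:

```python
import math
import random

def particle_filter_step(particles, u, z, motion_noise=0.1, meas_noise=0.5):
    """One predict/update/resample cycle of a particle filter (1-D pose).

    Minimal sketch of the Bayes filter above: the particle set
    approximates bel(x_t); weights come from a Gaussian likelihood.
    The noise parameters are illustrative assumptions.
    """
    # Predict: sample from the motion model p(x_t | u_t, x_{t-1})
    predicted = [x + u + random.gauss(0.0, motion_noise) for x in particles]
    # Update: weight by the measurement likelihood p(z_t | x_t)
    weights = [math.exp(-0.5 * ((z - x) / meas_noise) ** 2) for x in predicted]
    # Resample: draw particles in proportion to their weights (the eta step)
    return random.choices(predicted, weights=weights, k=len(particles))

particles = [random.uniform(0.0, 10.0) for _ in range(500)]  # uniform prior
for u, z in [(1.0, 3.1), (1.0, 4.0), (1.0, 5.2)]:            # move 1 m, observe
    particles = particle_filter_step(particles, u, z)
print(round(sum(particles) / len(particles), 2))             # estimate, roughly 5
```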
In practice, methods like Adaptive Monte Carlo Localization (AMCL) are commonly used within Nav2.
## Perception
Perception enables the robot to interpret sensor data to detect and classify objects. This includes:
- Object detection: identifying candidate objects in the scene
- Classification: assigning semantic labels (e.g., electronics vs. other)
Libraries such as OpenCV and Open3D are commonly used for image and point cloud processing.
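As a small example of the detection step, the sketch below extracts candidate object regions with OpenCV contour extraction. The threshold and minimum-area values are illustrative assumptions; a deployed system would typically use a learned detector and classifier instead:

```python
import cv2
import numpy as np

def detect_objects(image_bgr, min_area=500):
    """Find candidate object regions as bounding boxes via contour extraction.

    Illustrative pipeline: grayscale -> threshold -> contours. Real systems
    typically replace the fixed threshold with a learned detector.
    """
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]
    return boxes  # each box is (x, y, width, height)

# Synthetic test image: a dark background with one bright square "object"
img = np.zeros((240, 320, 3), dtype=np.uint8)
img[60:180, 100:220] = 255
print(detect_objects(img))  # [(100, 60, 120, 120)]
```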
Perception outputs are used by higher-level decision systems to guide actions such as navigation and manipulation.