
2 Theory

Delft University of Technology
Updated: 04 May 2026

Autonomous mobile manipulation requires the integration of navigation, mapping, perception, and localization into a unified system. In this work, these components are orchestrated using a behavior-based control framework.


Navigation refers to the process of planning and executing collision-free motion from a start position to a goal position. It is typically decomposed into:

- Global planning: computing a path over the whole known map.
- Local planning: following that path while reacting to nearby obstacles.

Modern navigation systems use costmaps and search-based planners such as A* or Dijkstra. In Nav2, navigation is modular, so different planners and controllers can be interchanged. Since local planning can largely be left to Nav2, this section only discusses some theory on how coverage path planning is done in this case.
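As a concrete illustration of search-based global planning on an occupancy grid, a minimal A* sketch might look as follows (a hypothetical helper for illustration, not Nav2 code):

```python
import heapq
import itertools

def astar(grid, start, goal):
    """A* search over a 2D occupancy grid (0 = free, 1 = occupied).

    Uses 4-connected moves with unit step cost and an admissible
    Manhattan-distance heuristic; returns a list of (row, col) cells
    from start to goal, or None if no path exists.
    """
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    tie = itertools.count()  # tiebreaker so the heap never compares parents
    frontier = [(h(start), 0, next(tie), start, None)]
    came_from, best_g = {}, {start: 0}
    while frontier:
        _, g, _, cur, parent = heapq.heappop(frontier)
        if cur in came_from:       # already expanded with a lower cost
            continue
        came_from[cur] = parent
        if cur == goal:            # reconstruct the path by walking parents
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    heapq.heappush(frontier, (ng + h(nxt), ng, next(tie), nxt, cur))
    return None

# The only shortest route detours around the blocked middle row.
grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = astar(grid, (0, 0), (2, 0))
```

Dijkstra's algorithm is the special case with the heuristic set to zero; in Nav2 the costmap additionally inflates cells near obstacles rather than treating them as purely free or occupied.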

Global Planning Methods

Global path planning consists of two major steps: decomposition of the free space into cells, and path planning within and between those cells.
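Within each decomposed cell, coverage path planning typically reduces to a back-and-forth (boustrophedon) sweep. A minimal sketch for one rectangular cell (the function name and interface are assumptions for illustration):

```python
def boustrophedon_sweep(x_min, x_max, y_min, y_max, spacing):
    """Back-and-forth (lawnmower) sweep over one rectangular cell of the
    decomposed free space; returns coverage waypoints as (x, y) tuples.
    """
    waypoints, x, going_up = [], x_min, True
    while x <= x_max:
        if going_up:
            waypoints += [(x, y_min), (x, y_max)]  # sweep upward
        else:
            waypoints += [(x, y_max), (x, y_min)]  # sweep downward
        going_up = not going_up                    # reverse direction each pass
        x += spacing
    return waypoints

# A 2 m x 1 m cell swept with 1 m spacing between passes.
sweep = boustrophedon_sweep(0, 2, 0, 1, 1)
```

The sweep spacing is normally chosen from the robot's footprint or sensor coverage width so adjacent passes overlap slightly.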


Mapping

Mapping is the process of constructing a representation of the environment. In unknown environments, this is often addressed using Simultaneous Localization and Mapping (SLAM).

A common representation is the occupancy grid, where each cell encodes the probability of being occupied. Mapping can be expressed probabilistically as:

$$p(m \mid z_{1:t}, x_{1:t})$$

where:

- $m$ is the map,
- $z_{1:t}$ are the sensor measurements up to time $t$,
- $x_{1:t}$ are the robot poses up to time $t$.

Different SLAM approaches (e.g., grid-based vs feature-based) trade off accuracy, computational cost, and robustness.
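In grid-based approaches, the per-cell posterior $p(m \mid z_{1:t}, x_{1:t})$ is usually maintained in log-odds form, so each Bayes update becomes a simple addition. A minimal sketch, with illustrative inverse-sensor-model probabilities chosen here for the example:

```python
import math

def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))

# Illustrative inverse-sensor-model values (assumed, not from the text):
L_OCC, L_FREE, L_PRIOR = logit(0.7), logit(0.3), logit(0.5)

def update_cell(l, hit):
    """Log-odds Bayes update of one grid cell for one observation."""
    return l + (L_OCC if hit else L_FREE) - L_PRIOR

def probability(l):
    """Convert a cell's log-odds back to occupancy probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(l))

l = L_PRIOR  # start at the prior, p = 0.5 (log-odds 0)
for hit in (True, True, False, True):  # three hits, one miss
    l = update_cell(l, hit)
p_occupied = probability(l)
```

Because the updates are additive, conflicting observations partially cancel, which is what makes the representation robust to occasional sensor noise.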


Localization

Localization estimates the robot’s pose within a known map. A widely used approach is probabilistic localization, such as particle filters.

The recursive Bayesian update is given by:

$$p(x_t \mid z_{1:t}, u_{1:t}) \propto p(z_t \mid x_t)\, p(x_t \mid u_t, x_{t-1})$$

where:

- $x_t$ is the robot pose at time $t$,
- $z_t$ is the measurement at time $t$,
- $u_t$ is the control input (e.g., odometry) at time $t$.

In practice, methods like Adaptive Monte Carlo Localization (AMCL) are commonly used within Nav2.
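The recursive Bayesian update above can be approximated with a particle filter, which is how AMCL works in 2D. A toy 1-D sketch of one predict-weight-resample cycle (the Gaussian motion and measurement models and all numeric values are assumptions for illustration):

```python
import math
import random

def particle_filter_step(particles, u, z, motion_noise=0.1, meas_noise=0.5):
    """One predict-weight-resample cycle of a 1-D particle filter.

    particles: list of scalar poses; u: odometry increment;
    z: direct (noisy) observation of the true pose.
    """
    # Predict: sample from the motion model p(x_t | u_t, x_{t-1}).
    predicted = [x + u + random.gauss(0, motion_noise) for x in particles]
    # Weight: measurement likelihood p(z_t | x_t), here a Gaussian.
    weights = [math.exp(-0.5 * ((z - x) / meas_noise) ** 2) for x in predicted]
    total = sum(weights) or 1.0
    weights = [w / total for w in weights]
    # Resample: draw particles in proportion to their weights.
    return random.choices(predicted, weights=weights, k=len(particles))

random.seed(0)
particles = [random.uniform(0, 10) for _ in range(500)]  # uniform prior
for step in range(1, 6):                                 # true pose: 2.0 + step
    particles = particle_filter_step(particles, u=1.0, z=2.0 + step)
estimate = sum(particles) / len(particles)               # converges near 7.0
```

AMCL adds an adaptive step on top of this: it resizes the particle set (KLD sampling) so that fewer particles are used once the estimate has converged.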


Perception

Perception enables the robot to interpret sensor data to detect and classify objects, covering tasks such as object detection, classification, and pose estimation from images and point clouds.

Libraries such as OpenCV and Open3D are commonly used for image and point cloud processing.

Perception outputs are used by higher-level decision systems to guide actions such as navigation and manipulation.
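As a small example of turning raw sensor data into object hypotheses, the sketch below labels connected components in a binary detection mask. This is comparable to OpenCV's `connectedComponents`, but written here in plain Python for illustration:

```python
from collections import deque

def find_blobs(mask):
    """Label 4-connected components in a binary mask.

    Returns a list of blobs, each a list of (row, col) pixels; each blob
    is a candidate object region for later classification.
    """
    rows, cols = len(mask), len(mask[0])
    seen, blobs = set(), []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and (r, c) not in seen:
                # Flood-fill one component with breadth-first search.
                queue, blob = deque([(r, c)]), []
                seen.add((r, c))
                while queue:
                    cr, cc = queue.popleft()
                    blob.append((cr, cc))
                    for nr, nc in ((cr + 1, cc), (cr - 1, cc),
                                   (cr, cc + 1), (cr, cc - 1)):
                        if (0 <= nr < rows and 0 <= nc < cols
                                and mask[nr][nc] and (nr, nc) not in seen):
                            seen.add((nr, nc))
                            queue.append((nr, nc))
                blobs.append(blob)
    return blobs

# Two separate regions in a tiny 3x3 mask.
mask = [[1, 1, 0],
        [0, 0, 0],
        [0, 1, 1]]
blobs = find_blobs(mask)
```

In a full pipeline, such a mask would typically come from color thresholding or a segmentation network, and each blob's centroid and extent would feed the higher-level decision system mentioned above.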

