Simultaneous Object Learning and Tracking (SOLT)

Contributor: Seongyong Koo

When manipulating multiple unknown objects in human environments, a robot should be able to identify and track individual objects in real time. A human can easily accomplish this within several trials, e.g. an infant learning by observation or by interaction, but for a robot it remains a largely unexplored area because of the many uncertainties in the structure and dynamics of the target objects. For example, an object's shape can be flexible and have articulated parts, as with a human hand, and the number of objects to track can vary. In addition, the dynamic movements of multiple objects give rise to various interactions between objects, such as object separation, partial or complete occlusion, and contact between multiple objects, as shown in Fig. 1. Without prior object models, these situations distort the observed point data of each object and thus reduce the robustness of tracking.

With the advent of RGB-D cameras and improvements in point cloud processing, the observed environment can be represented as point set data, in which each point contains not only RGB color but also 3-D position information. Multiple object tracking from point set data involves assigning each point to its true object track at each time step. To solve this problem without any prior knowledge, this research proposes a framework of simultaneous object learning and tracking (SOLT) for multiple moving objects from RGB-D point set data, as illustrated in Fig. 2. In this framework, each object model is incrementally updated at each time step from the identified point data, which are fed back from the robust tracking process based on the previously constructed object model.

This approach involves three main problems: 1. how to represent arbitrary objects flexibly and robustly, 2. how to segment and track the points of individual objects in the sensor data, and 3. how to update the model with newly observed data. The research aims to study recent machine learning and computer vision techniques to implement real-time, high-performance, model-free tracking of multiple objects. This process will be applied to humanoid robots for learning to manipulate unknown objects, either by observing human demonstrations or through the robot's own interaction with the objects.
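
The feedback loop between tracking and learning can be sketched in a few lines of Python. This is only an illustrative toy under strong assumptions, not the method of the publications listed below: each object model is reduced to a running mean over (x, y, z, r, g, b) features, points are assigned to the nearest model by plain Euclidean distance, and the number of objects is fixed. The names ObjectModel, assign_points, and solt_step are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

# One RGB-D point: 3-D position followed by RGB color (assumed layout).
Point = Tuple[float, float, float, float, float, float]


@dataclass
class ObjectModel:
    """Hypothetical object model: a running mean of point features."""
    mean: List[float]
    count: int = 1  # the initial seed counts as one observation

    def update(self, points: Sequence[Point]) -> None:
        # Incremental mean update, one point at a time.
        for p in points:
            self.count += 1
            for i, value in enumerate(p):
                self.mean[i] += (value - self.mean[i]) / self.count


def sq_dist(p: Sequence[float], q: Sequence[float]) -> float:
    """Squared Euclidean distance over position and color together."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign_points(points: Sequence[Point],
                  models: Dict[str, ObjectModel]) -> Dict[str, List[Point]]:
    """Tracking step: label each point with its nearest object model."""
    labels: Dict[str, List[Point]] = {name: [] for name in models}
    for p in points:
        nearest = min(models, key=lambda name: sq_dist(p, models[name].mean))
        labels[nearest].append(p)
    return labels


def solt_step(points: Sequence[Point],
              models: Dict[str, ObjectModel]) -> Dict[str, List[Point]]:
    """One frame: track with the current models, then learn from the result."""
    labels = assign_points(points, models)
    for name, assigned in labels.items():
        models[name].update(assigned)
    return labels
```

Calling solt_step once per frame mirrors the loop described above: the models built from earlier frames identify the points of the current frame, and the identified points in turn refine the models. A practical system would need a richer representation, handling of occlusion and contact, and creation and deletion of tracks, none of which this sketch attempts.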


Source code

Related Publications

Journal articles

  1. Seongyong Koo, Dongheui Lee, and Dong-Soo Kwon, “Incremental object learning and robust tracking of multiple objects from RGB-D point set data”, Journal of Visual Communication and Image Representation (JVCI), Vol. 25, No. 1, pp. 108-121, 2014.

Peer-reviewed conference papers

  1. Seongyong Koo, Dongheui Lee, and Dong-Soo Kwon, “Unsupervised object individuation from RGB-D image sequences”, 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2014) at Chicago, IL, USA, Nov. 2014, pp. 4450-4457.
  2. Seongyong Koo, Dongheui Lee, and Dong-Soo Kwon, “Multiple Object Tracking Using an RGB-D Camera by Hierarchical Spatiotemporal Data Association”, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2013) at Tokyo, Japan, Nov. 2013, pp. 1113-1118.
  3. Seongyong Koo, Dongheui Lee, and Dong-Soo Kwon, “GMM-based 3D Object Representation and Robust Tracking in Unconstructed Dynamic Environments”, 2013 IEEE International Conference on Robotics and Automation (ICRA 2013) at Karlsruhe, Germany, May 2013, pp. 1106-1113.