Blog | 03-11-2022

Technology Trinity: improved robot reliability with smart computer vision software

written by: Diogo Martins

To ensure the reliable operation of our pick & place solutions, we have developed the Technology Trinity at Smart Robotics. The Technology Trinity refers to our unique combination of Vision, Motion and Task Planning algorithms that allow our pick & place robots to accurately handle a large variety of items and to continuously adapt and improve efficiency. In this blog series, we will dive into the three pillars of our Technology Trinity. This first blog will focus on Vision and how our computer vision software improves the accuracy and reliability of our robots.

What is computer vision?

Computer vision extracts and analyzes information from an image or sequence of images, just like the human visual system does. The goal of computer vision within robotics is to understand the environment using digital images or videos, to enable the robot to perform tasks such as the picking and placing of items.

Why do pick & place robots use computer vision?

At Smart Robotics we use computer vision to teach our pick & place robots the answer to questions such as:

  • Where can the system pick from?
  • What can be picked?
  • What is the best location to pick an item?
  • What item is the robot holding?
  • What is the best location to place an item?

To be able to teach the robot the answers to these questions, we have implemented special 3D cameras in our pick & place systems. They capture images that help the robot create an understanding of its environment. Our vision algorithms then enable the robot to perform its pick & place task accurately. Taking our Smart Item Picker as an example, the following steps are executed during every pick & place action to ensure gentle and extremely reliable item handling.
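The six steps described below can be sketched as a single pipeline. This is a minimal illustrative sketch only: every function name and data structure here is a hypothetical placeholder, not Smart Robotics' actual API, and each stage is stubbed out with toy values.

```python
# Hypothetical sketch of the six-step pick & place pipeline described below.
# All functions are illustrative stubs; the real system wires these stages
# to 3D cameras, deep learning models, and motion planning.

def acquire_image(cameras):
    return {"depth": [[0.4] * 4] * 4}                      # Step 1: depth image

def detect_tote(image):
    return {"x": 0.0, "y": 0.0, "w": 0.6, "h": 0.4}        # Step 2: tote pose

def detect_items(image, tote):
    return [{"pos": (0.1, 0.2), "material": "cardboard"}]  # Step 3: items found

def determine_grasp_pose(items):
    return {"target": items[0]["pos"], "tilt_deg": 0.0}    # Step 4: grasp pose

def verify_item(item):
    return {"length": 0.12, "width": 0.08, "height": 0.05} # Step 5: true dims

def plan_placement(dims, tote):
    return {"pos": (0.05, 0.05), "drop_height": 0.01}      # Step 6: placement

def run_cycle(cameras):
    """Run one pick & place cycle, stage by stage."""
    image = acquire_image(cameras)
    tote = detect_tote(image)
    items = detect_items(image, tote)
    grasp = determine_grasp_pose(items)
    dims = verify_item(items[0])
    return plan_placement(dims, tote)

plan = run_cycle(cameras=None)
```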

Step 1: image acquisition

Once the robot receives a pick request, a corresponding pick action is triggered and an image capture request is sent to each 3D camera.


Step 2: tote detection

From the depth image retrieved by the 3D camera, our algorithm detects the pick tote. This information is used to update the robot's 'world model'. It ensures the robot knows the exact position of the pick tote, preventing collisions with the tote when performing a pick action.
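As a toy illustration of what tote detection from a depth image can look like, the sketch below assumes the tote rim sits at a known, roughly constant depth and simply finds the bounding box of pixels near that depth. The real detector is of course more robust; this only shows the idea of turning depth pixels into a tote pose for the world model.

```python
# Minimal tote-detection sketch, assuming the tote rim is a flat structure
# at a roughly known depth. Illustrative only, not the production algorithm.

def detect_tote(depth, rim_depth, tol=0.01):
    """Return row/column bounds of pixels lying near the tote rim depth."""
    rows, cols = [], []
    for r, row in enumerate(depth):
        for c, d in enumerate(row):
            if abs(d - rim_depth) <= tol:   # pixel belongs to the tote rim
                rows.append(r)
                cols.append(c)
    if not rows:
        return None                          # no tote visible
    return (min(rows), max(rows)), (min(cols), max(cols))

# Toy 5x5 depth image (metres): rim at 0.50 m around a deeper 0.80 m interior.
depth = [
    [0.50, 0.50, 0.50, 0.50, 0.50],
    [0.50, 0.80, 0.80, 0.80, 0.50],
    [0.50, 0.80, 0.80, 0.80, 0.50],
    [0.50, 0.80, 0.80, 0.80, 0.50],
    [0.50, 0.50, 0.50, 0.50, 0.50],
]
bbox = detect_tote(depth, rim_depth=0.50)
```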

Step 3: item detection

When the pick tote is detected, the robot needs to determine where the items are located in the tote. Here, computer vision is combined with deep learning algorithms that process the image and create an understanding of how many items are in the tote, as well as the precise position of the items. With that information, the robot updates its collision model and knows how to pick the items without colliding. In addition, the deep learning algorithms determine the item's material, e.g. cardboard, paper, or hard or soft plastic, enabling the robot to choose the correct type of suction cup(s) for that material.
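The sketch below illustrates two pieces of this step under stated assumptions: counting items given a binary per-pixel item mask (which we assume the deep learning detector produces), and mapping a classified material to a suction-cup choice. The material-to-cup mapping is entirely hypothetical, invented for illustration.

```python
# Illustrative sketch only: the binary item mask and the material labels are
# assumed outputs of a deep learning detector, and the material-to-cup
# mapping is a hypothetical example, not Smart Robotics' actual table.

SUCTION_CUP_FOR_MATERIAL = {
    "cardboard": "large_single_cup",
    "paper": "low_vacuum_cup",
    "hard_plastic": "standard_cup",
    "soft_plastic": "multi_cup_array",
}

def choose_suction_cup(detections):
    """Pick a suction cup per detected item based on its classified material."""
    return [SUCTION_CUP_FOR_MATERIAL[d["material"]] for d in detections]

def count_items(mask):
    """Count connected blobs of 1s in a binary item mask (4-connectivity)."""
    seen = set()
    def flood(r, c):
        stack = [(r, c)]
        while stack:
            r, c = stack.pop()
            if (r, c) in seen:
                continue
            if 0 <= r < len(mask) and 0 <= c < len(mask[0]) and mask[r][c]:
                seen.add((r, c))
                stack += [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
    n = 0
    for r in range(len(mask)):
        for c in range(len(mask[0])):
            if mask[r][c] and (r, c) not in seen:
                n += 1
                flood(r, c)
    return n

detections = [
    {"pos": (0.10, 0.22), "material": "cardboard"},
    {"pos": (0.31, 0.05), "material": "soft_plastic"},
]
cups = choose_suction_cup(detections)

mask = [        # toy mask with two separate item blobs
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
n_items = count_items(mask)
```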

Step 4: grasp pose determination

Once the robot knows the position and material of items in the tote, it determines the ideal grasp pose based on the exact orientation of the items. For example, an item might be in a slanted position, resulting in the robot needing to tilt its gripper to place its suction cup(s) on the correct surface.
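The tilt in this example can be worked out from the item's top-surface normal, which we assume the vision system provides: the required gripper tilt is the angle between that normal and the vertical axis. A minimal sketch, assuming such a normal vector is available:

```python
import math

# Sketch of grasp-pose tilting for a slanted item: given the item's
# top-surface normal (assumed to come from the vision pipeline), compute
# how far the gripper must tilt from vertical so the suction cup sits
# flat on the surface. Illustrative only.

def gripper_tilt_deg(normal):
    """Angle between the surface normal and the vertical z-axis, in degrees."""
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    # cos(theta) = n . z / |n|, with z = (0, 0, 1)
    return math.degrees(math.acos(nz / length))

flat = gripper_tilt_deg((0.0, 0.0, 1.0))     # item lying flat: no tilt needed
slanted = gripper_tilt_deg((0.0, 1.0, 1.0))  # surface slanted at 45 degrees
```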

Step 5: item verification

After the item is picked, the robot moves up and over an upward-facing 3D camera that captures a new image of the item it is holding. But why is this necessary? As there are usually multiple items in a pick tote, some may lie on top of or partly cover others, so the robot is not yet fully aware of the exact dimensions of each item. A new image of the picked item, now seen in isolation, reveals its exact dimensions.
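One way such a measurement can work, sketched under assumptions: given a binary silhouette mask of the held item from the upward-facing camera, and a known scale (metres per pixel at the item's distance), the silhouette's bounding box gives the item's length and width. Both the mask and the scale are assumed inputs here, not the actual system's representation.

```python
# Illustrative item-verification sketch: dimensions from a binary silhouette
# mask captured by the upward-facing camera, assuming a known metres-per-pixel
# scale. Not the production measurement method.

def item_dimensions(mask, metres_per_pixel):
    """Length and width of the item's bounding box, in metres."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    length = (max(rows) - min(rows) + 1) * metres_per_pixel
    width = (max(cols) - min(cols) + 1) * metres_per_pixel
    return length, width

mask = [           # toy silhouette: item covers a 2 x 3 pixel region
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
dims = item_dimensions(mask, metres_per_pixel=0.05)
```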

Step 6: item placement

Now that the exact dimensions of the picked item are known, the robot can determine the optimal place position in a tote or on a belt. When an item has to be placed inside a tote, a new 3D image will be created to detect the tote and any items already inside it. A specially developed stacking algorithm then determines the optimal place position and calculates the distance to this position to ensure the item can be placed gently. As for placing an item on a moving belt, the robot will move to a fixed, predetermined position and then place the item.
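The flavour of such a stacking search can be sketched as follows, under assumed representations: the tote contents as a 2D heightmap (stacked height per cell) and the item footprint as a rectangle of cells. The sketch picks the position where the item would rest lowest and derives the remaining drop distance for a gentle placement; the real stacking algorithm is proprietary and certainly more sophisticated.

```python
# Illustrative stacking-style placement search. The heightmap and rectangular
# footprint are assumed representations; this is not the actual algorithm.

def best_place(heightmap, item_rows, item_cols, approach_height):
    """Find the footprint position with the lowest resting height."""
    best = None
    rows, cols = len(heightmap), len(heightmap[0])
    for r in range(rows - item_rows + 1):
        for c in range(cols - item_cols + 1):
            # The item rests on the tallest cell under its footprint.
            rest = max(heightmap[r + i][c + j]
                       for i in range(item_rows) for j in range(item_cols))
            if best is None or rest < best[0]:
                best = (rest, r, c)
    rest, r, c = best
    drop = approach_height - rest   # distance to lower the item gently
    return (r, c), drop

heightmap = [       # toy 3x3 place tote, heights in metres
    [0.10, 0.10, 0.00],
    [0.10, 0.10, 0.00],
    [0.05, 0.05, 0.00],
]
pos, drop = best_place(heightmap, item_rows=2, item_cols=1,
                       approach_height=0.30)
```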

How our Vision improves robot reliability

Our Vision algorithms, in combination with our Motion & Task Planning algorithms, enable our pick & place robots to detect an immense variety of items and determine the best way to pick and place them. This brings several advantages:

  • There is no need for SKU teaching, as the robot extracts the item information from the 3D images.
  • The robot learns from each 3D image it analyzes, bringing about better and quicker item detection and resulting in improved performance and higher reliability.
  • As the robot is constantly up to date on its environment, the position of the pick & place totes, and the dimensions of the item it is holding, there is very little risk of collision, resulting in gentle item handling and high-performance reliability.

And that is what every warehouse needs for its fulfillment process.