Saturday, September 16, 2017

Robot ND A3, Perception

Update: At Apple's launch event on 2017.9.12, the iPhone X uses an RGB-D camera in its front panel for Face ID.
In Term 2, an RGB-D camera will be sent as part of the hardware package.
An example of semantic segmentation (per-pixel object classification) in the Cityscapes Dataset. See Cordts et al., 2016 for details.
In robotics, we face real world limitations when it comes to things like power consumption, weight, volume and other physical parameters of the system. The Google Tensor Processing Unit (TPU) and the NVIDIA Drive PX2 are examples of some of the cutting edge compute hardware driving onboard processing in robotics to a new level.

3D sensors

Active sensors:
  • Lidar
  • Time-of-flight (ToF) camera: illuminates an area with an infrared light source and captures the entire field of view without any moving parts. Based on the working principle, ToF sensors fall into 2 categories: pulsed runtime and phase-shift continuous wave.
  • Ultrasonic sensor: sends out a high-frequency sound pulse. It uses 2 piezoelectric crystals, one as transmitter and one as receiver.
Passive sensors:
  • Monocular camera: reconstructs a 3D model from the video stream using a technique called Structure from Motion. Depth is calculated via triangulation, which requires accurate measurement of the camera pose throughout its movement. Check out this awesome paper by Saxena, Chung and Ng (2007) for the details.
  • Stereo camera: two monocular cameras separated by an accurately known distance. Depth information is obtained by comparing image frames from the two cameras viewing the same object or scene. The difference between the positions of a given object as perceived by the two cameras is called disparity (see the depth-from-disparity sketch after this list).
Hybrid:
RGB-D camera: similar to a ToF camera, but the return-wave modulation is performed spatially instead of temporally. The predefined patterns range from simple stripes to unique, convoluted speckle patterns. Unlike stereo cameras, RGB-D cameras save a lot of computational resources by providing per-pixel depth values directly instead of inferring depth from raw image frames. They are low cost, short range and low resolution.
Most RGB-D cameras use a technique called Structured Light to obtain depth information from a scene.
By comparing the known projected pattern with the reflected pattern, a depth map is generated, much like with a stereo camera.
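The stereo/disparity relationship mentioned above boils down to Z = f * B / d (focal length in pixels times baseline, over disparity). A minimal sketch; the numbers are made-up example values, not the course hardware:
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters from the stereo relation Z = f * B / d."""
    if disparity_px <= 0:
        return float('inf')  # zero disparity means the point is at infinity
    return focal_length_px * baseline_m / disparity_px

# e.g. 525 px focal length, 7 cm baseline, 30 px disparity -> ~1.2 m
print(depth_from_disparity(30.0, 525.0, 0.07))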

calibration, filtering, segmentation

With a 3D point cloud, the majority of the data actually corresponds to unwanted background objects or other things that must be filtered out.
No sensor is perfect. Calibration, at its core, is the process of comparing two measurements: a ground truth and a fresh measurement.
While Python has the OpenCV library to calibrate cameras, ROS also has its own camera_calibration package that allows easy calibration of monocular, stereo, and RGB-D cameras. Check out this tutorial.
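For reference, a minimal sketch of chessboard calibration with OpenCV; the 9x6 corner count and the calibration_images/ glob are assumptions for illustration:
import glob
import cv2
import numpy as np

nx, ny = 9, 6  # inner corners of the chessboard (assumed)
objp = np.zeros((nx * ny, 3), np.float32)
objp[:, :2] = np.mgrid[0:nx, 0:ny].T.reshape(-1, 2)  # corner grid in board coordinates

objpoints, imgpoints = [], []  # 3D board points and 2D image points per view
img_shape = None
for fname in glob.glob('calibration_images/*.jpg'):
    gray = cv2.cvtColor(cv2.imread(fname), cv2.COLOR_BGR2GRAY)
    img_shape = gray.shape[::-1]
    found, corners = cv2.findChessboardCorners(gray, (nx, ny), None)
    if found:
        objpoints.append(objp)
        imgpoints.append(corners)

# camera matrix and distortion coefficients (the calibration result)
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, img_shape, None, None)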

install python-pcl

python-pcl is a library of Python bindings for the powerful Point Cloud Library.
$ sudo pip install cython
$ cd ~/RoboND-Perception-Exercises/python-pcl
$ python setup.py build
$ sudo python setup.py install
Note that you can't simply pip install pcl because the library has many dependencies on the local environment, so you have to compile it locally. Also, the VM should have at least 4 GB of RAM.
If installing locally on a Mac:
brew install pcl
brew link --overwrite pcl
# install various other dependencies
pip install .
However, the error fatal error: 'pcl/point_cloud.h' file not found seems to be because I don't have a ROS environment.

point cloud data format

The Point Cloud Library was first released in March 2010.
A typical .pcd file looks like this:
# .PCD v0.7 - Point Cloud Data file format
VERSION 0.7
FIELDS x y z rgb
SIZE 4 4 4 4
TYPE F F F F
COUNT 1 1 1 1
WIDTH 202627
HEIGHT 1
VIEWPOINT 0 0 0 1 0 0 0
POINTS 202627
DATA ascii
0.038113248 -1.6091453 1.0870478 1.0624617e-39
0.035177067 -1.6083764 1.0873107 1.0624617e-39
Note:
  • the latest PCL release is 1.7.2 (2014)
  • fields can be several types; the rgb field packs three 8-bit color channels into a single float (see the sketch after these notes):
    FIELDS x y z       # XYZ data
    FIELDS x y z rgb   # XYZ + colors
    FIELDS x y z normal_x normal_y normal_z     # XYZ + surface normals
    FIELDS j1 j2 j3    # moment invariants
    
  • size: size of each field in bytes
  • type: I for signed integer (int8/char, int16/short, int32/int), U for unsigned, F for float
  • for an unordered point cloud, width is the number of points and height is set to 1
  • viewpoint: used for building transforms between different coordinate systems; translation (tx, ty, tz) + quaternion (qw, qx, qy, qz), default (0, 0, 0, 1, 0, 0, 0)
  • points: total number of points
  • data: storage type, ascii or binary; ascii is simple and human-readable, while binary is faster to load and save
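About the rgb column in the DATA section above (e.g. 1.0624617e-39): the three 8-bit color channels are packed into the bit pattern of a single 32-bit float. pcl_helper.py provides rgb_to_float/float_to_rgb for this; the sketch below shows the idea, though the exact bit layout in the helper may differ:
import struct

def rgb_to_float(r, g, b):
    """Pack three 8-bit channels into the bit pattern of a 32-bit float."""
    packed = (r << 16) | (g << 8) | b
    return struct.unpack('f', struct.pack('I', packed))[0]

def float_to_rgb(f):
    """Recover the three channels from the packed float."""
    packed = struct.unpack('I', struct.pack('f', f))[0]
    return (packed >> 16) & 0xFF, (packed >> 8) & 0xFF, packed & 0xFF

print(float_to_rgb(rgb_to_float(200, 150, 100)))  # -> (200, 150, 100)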

PointCloud vs PointCloud2

The sensor_msgs/PointCloud2 format was designed as a ROS message, and is the preferred choice for ROS applications.
The pcl/PointCloud<T> format represents the internal PCL point cloud format. For modularity and efficiency reasons, the format is templated on the point type, and PCL provides a list of templated common types which are SSE aligned.
$ rosmsg info sensor_msgs/PointCloud
$ rosmsg info sensor_msgs/PointCloud2
We can see that PointCloud2 carries much more metadata for inter-process communication.
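To poke at that metadata from Python, here is a small sketch using the sensor_msgs.point_cloud2 helpers to build a tiny XYZ-only cloud and read it back; the frame_id and sample points are arbitrary:
from std_msgs.msg import Header
import sensor_msgs.point_cloud2 as pc2

header = Header(frame_id='world')
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
cloud = pc2.create_cloud_xyz32(header, points)  # a sensor_msgs/PointCloud2

print(cloud.width, cloud.height, cloud.point_step, cloud.row_step)  # 3 1 12 36
print([f.name for f in cloud.fields])  # ['x', 'y', 'z']
for p in pc2.read_points(cloud, skip_nans=True):
    print(p)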

Exercise-1: point cloud filtering

input: pcl.load_XYZRGB('tabletop.pcd')
output: pcl.save(cloud_new,"xx.pcd"), pcl_viewer xx.pcd
Most commonly used filters from the Point Cloud Library:
  • VoxelGrid Downsampling Filter
  • ExtractIndices Filter
  • PassThrough Filter
  • RANSAC Plane Fitting: Random Sample Consensus. Assume a particular model, such as a plane, and estimate which points of the cloud belong to that shape. The number of iterations is a trade-off between compute time and model-detection accuracy.
  • Outlier Removal Filter
Depth camera point cloud is very dense. To increase the processing speed without sacrificing results, we downsample the data by Voxel Grid Downsampling Filter.
Pixel: picture element. Voxel: Volume element.
The majority of the data is not useful for identifying the target (the hammer). Things like the ground, the table and other objects do not aid in recognizing your target and can mostly be regarded as noise. Processing the complete point cloud, including all this excess data, is inefficient and wastes compute cycles.
RANSAC.py is used to process the PCD (point cloud data). Visualize a pcd with pcl_viewer filename.pcd.
Basic operations for pcd are implemented in RANSAC.py:
import pcl
cloud = pcl.load_XYZRGB('tabletop.pcd') # <type 'pcl._pcl.PointCloud_PointXYZRGB'>
## VoxelGrid downsampling filter
vox = cloud.make_voxel_grid_filter() # <type 'pcl._pcl.VoxelGridFilter_PointXYZRGB'>
leaf_size = 0.1
vox.set_leaf_size(leaf_size,leaf_size,leaf_size)
cloud_new = vox.filter() # <type 'pcl._pcl.PointCloud_PointXYZRGB'>
pcl.save(cloud_new,"cloud_name.pcd")
# PassThrough filter
passthrough = cloud.make_passthrough_filter()
filter_axis = 'z'
passthrough.set_filter_field_name(filter_axis)
axis_min = 0
axis_max = 2
passthrough.set_filter_limits(axis_min, axis_max)
cloud_new = passthrough.filter()
# segmenter filter: RANSAC
seg = cloud.make_segmenter()
seg.set_model_type(pcl.SACMODEL_PLANE)
seg.set_method_type(pcl.SAC_RANSAC)
max_distance = 1
seg.set_distance_threshold(max_distance)
inliers, coefficients = seg.segment()
extract_in= cloud.extract(inliers, negative=False)
extract_out= cloud.extract(inliers, negative=True)
# statistical outlier filter (removes Gaussian-like noise): any point whose mean neighbor distance is larger than (global mean distance + x*std_dev) is considered an outlier
outlier = cloud.make_statistical_outlier_filter()
outlier.set_mean_k(50)
x = 1.0
outlier.set_std_dev_mul_thresh(x)
cloud_new = outlier.filter()

Exercise 2: clustering for segmentation

input: camera data from ROS, obtained by subscribing to a topic called /sensor_stick/point_cloud.
output: after the point cloud is segmented by Euclidean clustering, publish the data back to topics called /pcl_table and /pcl_objects
setup:
$ cp -r ~/RoboND-Perception-Exercises/Exercise-2/sensor_stick ~/catkin_ws/src/
$ cd ~/catkin_ws
$ rosdep install --from-paths src --ignore-src --rosdistro=kinetic -y
$ catkin_make
# add following to .bashrc file
# export GAZEBO_MODEL_PATH=~/catkin_ws/src/sensor_stick/models
# source ~/catkin_ws/devel/setup.bash
$ roslaunch sensor_stick robot_spawn.launch
# what it actually launches: robot_description.launch, robot_control.launch, empty_world.launch, perception.rviz
$ cd ~/catkin_ws/src/sensor_stick/scripts/
$ ls
# pcl_helper.py template.py
$ cp template.py segmentation.py  # make a duplicate
$ ./segmentation.py
Clustering tries to answer a simple question: which components of a dataset naturally belong together?
  • k-means clustering: an iterative, unsupervised learning algorithm. Begin by randomly selecting k initial centroids, then create k clusters by associating each data point with the nearest centroid, and repeat. It may fail when clusters overlap heavily or have complex shapes, and it is highly sensitive to the random initialization.
  • Euclidean clustering, or DBSCAN (Density-Based Spatial Clustering of Applications with Noise): you may not know how many clusters to expect, but you do know something about how the points should be clustered (min_samples, max_dist). Each point ends up as a core member, an edge member, or an outlier (see the sketch after this list).
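A quick DBSCAN sketch with scikit-learn on made-up 2D blobs, just to show how eps (the max_dist above) and min_samples behave; the PCL EuclideanClusterExtraction used in the exercise appears in a later section:
import numpy as np
from sklearn.cluster import DBSCAN

# two well-separated blobs plus one stray point; label -1 marks outliers
rng = np.random.RandomState(0)
blob_a = rng.normal(loc=(0, 0), scale=0.1, size=(50, 2))
blob_b = rng.normal(loc=(2, 2), scale=0.1, size=(50, 2))
points = np.vstack([blob_a, blob_b, [[10.0, 10.0]]])

labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(points)
print(set(labels))  # e.g. {0, 1, -1}: two clusters and one outlier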

publish point cloud to new topics

In sensor_stick/scripts/, duplicate template.py to segmentation.py:
from pcl_helper import *

def pcl_callback(pcl_msg):
    cloud = ros_to_pcl(pcl_msg)
    # process the cloud data as in the last exercise
    # Euclidean clustering as in the next section
    ros_cloud_objects = pcl_to_ros(cloud)
    pcl_objects_pub.publish(ros_cloud_objects)

rospy.init_node('clustering', anonymous=True)
pcl_objects_pub = rospy.Publisher("/pcl_objects", PointCloud2, queue_size=1)
pcl_table_pub = rospy.Publisher("/pcl_table", PointCloud2, queue_size=1)
pcl_sub = rospy.Subscriber("/sensor_stick/point_cloud", pc2.PointCloud2, pcl_callback, queue_size=1)
while not rospy.is_shutdown():
    rospy.spin()
After executing ./segmentation.py, you should see your new topics pcl_objects and pcl_table being published in RViz.
The ROS message type here is sensor_msgs.msg.PointCloud2, which carries a lot of metadata such as header, height, width, fields, is_bigendian, point_step and row_step. The bulk of the message lives in data, a packed byte buffer in which each point's fields are encoded with struct (see the sketch below).
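Roughly what ros_to_pcl() from pcl_helper.py does with that buffer (the actual helper may differ in details): unpack the points with pc2.read_points and load them into a python-pcl cloud.
import pcl
import sensor_msgs.point_cloud2 as pc2

def ros_to_pcl_sketch(ros_cloud):
    """Unpack a sensor_msgs/PointCloud2 into (x, y, z, rgb) tuples
    and load them into a pcl.PointCloud_PointXYZRGB."""
    points = [(x, y, z, rgb)
              for x, y, z, rgb in pc2.read_points(ros_cloud, skip_nans=True)]
    cloud = pcl.PointCloud_PointXYZRGB()
    cloud.from_list(points)
    return cloud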

Euclidean Clustering:

# Euclidean Clustering
white_cloud = XYZRGB_to_XYZ(extracted_outliers) # <type 'pcl._pcl.PointCloud'>
tree = white_cloud.make_kdtree() # <type 'pcl._pcl.KdTree'>
ec = white_cloud.make_EuclideanClusterExtraction()
ec.set_ClusterTolerance(0.02) # for hammer
ec.set_MinClusterSize(10)
ec.set_MaxClusterSize(250)
ec.set_SearchMethod(tree)
cluster_indices = ec.Extract() # indices for each cluster (a list of lists)
# Assign a color to each cluster
cluster_color = get_color_list(len(cluster_indices))
color_cluster_point_list = []
for j, indices in enumerate(cluster_indices):
    for i, indice in enumerate(indices):
        color_cluster_point_list.append([white_cloud[indice][0],
                                         white_cloud[indice][1],
                                         white_cloud[indice][2],
                                         rgb_to_float(cluster_color[j])])
# Create new cloud containing all clusters, each with unique color
cluster_cloud = pcl.PointCloud_PointXYZRGB()
cluster_cloud.from_list(color_cluster_point_list)
# publish to cloud
ros_cluster_cloud = pcl_to_ros(cluster_cloud)
# save to local
pcl.save(cluster_cloud, 'cluster.pcd')
Note:
get_color_list.color_list is a static variable, so you should define it outside the get_color_list function. I defined it under if __name__ == '__main__': as get_color_list.color_list = []

Exercise 3: object recognizer

input: cloud data from ROS
output: clustered cloud with recognized label
Possible Features for identifying an object:
  1. object color and shape
  2. detailed color information
  3. detailed shape information
color histogram
import numpy as np
import cv2
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
image = mpimg.imread('Udacican.jpeg')    
# Define a function to compute color histogram features  
def color_hist(img, nbins=32, bins_range=(0, 256)):
    hsv_img = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)
    h_hist = np.histogram(hsv_img[:,:,0], bins=nbins, range=bins_range)
    s_hist = np.histogram(hsv_img[:,:,1], bins=nbins, range=bins_range)
    v_hist = np.histogram(hsv_img[:,:,2], bins=nbins, range=bins_range)
    # Concatenate the histograms into a single feature vector
    hist_features = np.concatenate((h_hist[0], s_hist[0], v_hist[0])).astype(np.float64)
    norm_features = hist_features / np.sum(hist_features)
    return norm_features   
feature_vec = color_hist(image, nbins=32, bins_range=(0, 256))

setup:

Warning: because the folder name "sensor_stick" is the same as in Exercise 2, it's best to back up your old folder before copying in the new one with cp -r.
$ cd RoboND-Perception-Exercises/Exercise-3
$ cp -r sensor_stick/ ~/catkin_ws/src/
$ cd ~/catkin_ws
$ rosdep install --from-paths src --ignore-src --rosdistro=kinetic -y
$ catkin_make
$ roslaunch sensor_stick training.launch
# actual: robot_description.launch, robot_control.launch, empty_world.launch, cloud_transformer, feature_extractor
$ cd ~/catkin_ws/src/sensor_stick/scripts
$ ls
# capture_features.py, template.py, train_svm.py
$ rosrun sensor_stick capture_features.py
# generate color_histogram and normal_histogram for 7 classes, pickle.dump into "training_set.sav" at current dir
$ rosrun sensor_stick train_svm.py  
# load dataset, StandardScaler() for X, LabelEncoder for y, KFold cross-validation for the score, plot confusion matrix, then train again on the whole dataset and pickle.dump to 'model.sav'
# this code can be run independent of ros

data generation: capture_features.py

  1. Use spawn_model() and capture_sample() from sensor_stick/training_helper.py to generate ROS PointCloud2 clouds.
  2. The ROS PointCloud2 data is parsed by pc2.read_points. Use compute_color_histograms() and compute_normal_histograms() from sensor_stick/features.py to divide each color channel and each normal direction into 32 bins, giving 192 features in total. The exact number of RGB-D points is not important, because a histogram measures the distribution of values, i.e. relative counts (see the sketch after this list).
  3. Concatenate the 192 features into one sample. There are 7 classes, each captured 5 times, so the training set is (35, 192).
  4. pickle.dump(labeled_features, open('training_set.sav', 'wb'))
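A minimal sketch of how those 192 features come together (32 bins x 3 color channels + 32 bins x 3 normal components); the helper name and bin ranges below are assumptions, the course's features.py may differ:
import numpy as np

def histogram_features(channels, nbins=32, bins_range=(0, 256)):
    """Histogram each channel, concatenate, and normalize to sum to 1."""
    hists = [np.histogram(c, bins=nbins, range=bins_range)[0] for c in channels]
    features = np.concatenate(hists).astype(np.float64)
    return features / np.sum(features)

# hypothetical per-point arrays; in the exercise these come from the cloud
h, s, v = np.random.rand(3, 500) * 255        # color channels (e.g. HSV)
nx, ny, nz = np.random.rand(3, 500) * 2 - 1   # surface normal components

color_feats = histogram_features([h, s, v])                          # 96 values
normal_feats = histogram_features([nx, ny, nz], bins_range=(-1, 1))  # 96 values
sample = np.concatenate([color_feats, normal_feats])                 # 192 values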
To improve the features (capture_features.py / features.py):
  • Increase the value in the loop that begins with for i in range(5): to capture features for each object more times.
  • To use HSV, find the line in capture_features.py where compute_color_histograms() is called and change the flag to using_hsv=True.
  • Try different binning schemes.

machine learning: train_svm.py

  • training_set = pickle.load(open('training_set.sav', 'rb'))
  • preprocess: StandardScaler() for X, LabelEncoder for y,
  • KFold cross-validation for the score, plot the confusion matrix
  • use whole data to train again and get the final model
  • model = {'classifier': clf, 'classes': encoder.classes_, 'scaler': X_scaler}
  • pickle.dump to 'model.sav'
  • If you have the 192 features for a new object and want to make a prediction, first reformat the data: X_test = X_scaler.transform(np.array(features).reshape(1,192)), then call clf.predict(X_test) (see the sketch after this list)
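Putting those bullets together, a sketch of the train_svm.py pipeline; variable names and the linear kernel are assumptions, and the course script may use the older sklearn cross_validation module instead of model_selection:
import pickle
import numpy as np
from sklearn import svm
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import cross_val_score

data = pickle.load(open('training_set.sav', 'rb'))  # list of (features, label)
X = np.array([f for f, _ in data])
y_raw = [label for _, label in data]

X_scaler = StandardScaler().fit(X)
X_train = X_scaler.transform(X)
encoder = LabelEncoder()
y_train = encoder.fit_transform(y_raw)

clf = svm.SVC(kernel='linear')
print(cross_val_score(clf, X_train, y_train, cv=5).mean())  # k-fold accuracy

clf.fit(X_train, y_train)  # retrain on the whole set for the final model
model = {'classifier': clf, 'classes': encoder.classes_, 'scaler': X_scaler}
pickle.dump(model, open('model.sav', 'wb'))

# prediction on one new 192-dim feature vector:
# pred = clf.predict(X_scaler.transform(np.array(features).reshape(1, 192)))
# label = encoder.inverse_transform(pred)[0]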

put them all together

$ cd ~/catkin_ws/src/sensor_stick/scripts/
$ cp template.py object_recognition.py
# finish all TO-DO
$ roslaunch sensor_stick robot_spawn.launch
$ chmod +x object_recognition.py
$ ./object_recognition.py

project

Amazon robotics challenge, 2017.7.30: https://www.amazonrobotics.com/#/roboticschallenge/items
demo:
$ cd ~/catkin_ws/src/RoboND-Perception-Project/pr2_robot/scripts
$ chmod u+x pr2_safe_spawner.sh
$ ./pr2_safe_spawner.sh # what it actually does:
# roslaunch pr2_robot pick_place_demo.launch
# roslaunch pr2_moveit pr2_moveit.launch
# rosrun pr2_robot pr2_motion
  • main window + collision map
  • click the Next button at the bottom left of the RViz window
  • the robot starts in an idle state
  • adds 2 dropboxes
  • raises both arms
  • rotates in place
  • identifies 3 tabletop objects and plans pick-and-place trajectories
test your code:
# compare pick lists (the models folder has 12 models):
models = ['beer', 'bowl', 'create', 'disk_part', 'hammer', 'plastic_cup','soda_can']  # original
models = ['biscuits', 'soap', 'soap2'] # pick_list_1.yaml
models = ['biscuits', 'soap', 'book', 'soap2', 'glue'] # pick_list_2.yaml
models = ['sticky_notes', 'book','snacks', 'soap', 'biscuits','eraser', 'soap2', 'glue'] # pick_list_3.yaml
$ cd ~/catkin_ws
$ roslaunch sensor_stick training.launch
# change the 'models' list; output file is "training_set.sav"
$ rosrun sensor_stick capture_features.py
# change the load file name; output file is 'model.sav'
$ rosrun sensor_stick train_svm.py
# change "test.world" and "pick_list" to the corresponding number
$ roslaunch pr2_robot pick_place_project.launch

# change the load file name, the output file name "output_1.yaml", and test_scene_num
$ rosrun pr2_robot project_template.py
There are 3 scenarios to work with: .world files located at /pr2_robot/worlds/. The corresponding pick lists are .yaml files at /pr2_robot/config.
Execution plan:
  1. according to the pick list, modify the models list in capture_features.py
  2. generate the training set
  3. train the SVM model
  4. add the trained model to the perception pipeline

output to yaml file

The task is to write a ROS client that sends a request to the pr2_pick_place_server. The file srv/PickPlace.srv defines the request/response variables in terms of message types. Each message type is an object with attributes such as data; check the details with rosmsg info xx.
from std_msgs.msg import Int32
test_scene_num = Int32()
test_scene_num.data = 1

from std_msgs.msg import String
object_name = String()
object_name.data = object_list_param[i]['name']

from geometry_msgs.msg import Pose
pick_pose = Pose()
pick_pose.position = xx # Point: x,y,z
pick_pose.orientation = xx # Quaternion: x,y,z,w
Create a .yaml output file. Within your code you will iterate over each item in the pick list, check whether you found it in your perception analysis, and if so populate the pick_pose message with the centroid. Since you handle one object at a time, a convenient way to save the messages for each object is in a list of dictionaries built by make_yaml_dict() (a sketch follows).
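A sketch of that dictionary-building and yaml-dump step; the project template builds the dict with rospy_message_converter, a manual equivalent is shown here and the exact layout of the output file is an assumption:
import yaml

def make_yaml_dict(test_scene_num, arm_name, object_name, pick_pose, place_pose):
    """Convert the ROS messages for one pick request into a plain dict."""
    def pose_to_dict(pose):
        return {'position': {'x': pose.position.x, 'y': pose.position.y, 'z': pose.position.z},
                'orientation': {'x': pose.orientation.x, 'y': pose.orientation.y,
                                'z': pose.orientation.z, 'w': pose.orientation.w}}
    return {'test_scene_num': test_scene_num.data,
            'arm_name': arm_name.data,
            'object_name': object_name.data,
            'pick_pose': pose_to_dict(pick_pose),
            'place_pose': pose_to_dict(place_pose)}

def send_to_yaml(yaml_filename, dict_list):
    """Dump the list of per-object dictionaries to a .yaml file."""
    with open(yaml_filename, 'w') as f:
        yaml.dump({'object_list': dict_list}, f, default_flow_style=False)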
PR2 collision avoidance: change point_cloud_topic in config/sensors.yaml from /pr2/voxel_grid/points to /pr2/3d_map/points.
To rotate the robot for collision mapping, publish joint-angle values directly to the world_joint_controller (a sketch follows).
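A minimal sketch of that publish; the topic name below follows the usual <controller>/command convention and is an assumption, so check rostopic list for the exact name in your setup:
import rospy
from std_msgs.msg import Float64

rospy.init_node('pr2_rotator')
# assumed topic name; verify with `rostopic list`
pub = rospy.Publisher('/pr2/world_joint_controller/command', Float64, queue_size=10)
rospy.sleep(1.0)                 # give the publisher time to connect
pub.publish(Float64(data=1.57))  # rotate the world joint by ~90 degrees (sign convention may vary)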
