Wednesday, April 19, 2017

Self-driving Car ND A4, Advanced lane line, ending summary

Course 16, Project 4: Advanced Lane Finding
The following are my notes. Each section is labeled by the lesson number.

1 computer vision

Robotics can be broken down into a 3-step cycle:
  1. sense or perceive the world
  2. decide what to do based on that perception
  3. perform an action to carry out that decision
80% of the challenge of building a self-driving car is perception.
A camera is used because it has better spatial resolution and is much cheaper, although it lacks the 3D information that can be captured by Lidar.

4 why correct image distortion?

Distortion can change:
  1. the apparent size of an object
  2. the apparent shape of an object
  3. how an object appears, depending on where it is in the field of view
  4. how near or far an object appears, making it look closer or farther away than it actually is
Pinhole camera images are free from distortion, but lenses tend to introduce it.
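
To see how a lens changes apparent position and size, here is a minimal NumPy sketch of the radial part of the distortion model that OpenCV's dist coefficients (k1, k2, p1, p2, k3) describe. The function name radial_distort and the coefficient value are made up for illustration, and the tangential terms (p1, p2) are left out.

```python
import numpy as np

def radial_distort(points, k1, k2=0.0, k3=0.0):
    """Apply radial distortion to normalized image coordinates.

    Coordinates are measured from the optical center, so r is the
    distance of a point from the middle of the image.
    """
    points = np.asarray(points, dtype=float)
    r2 = np.sum(points ** 2, axis=1, keepdims=True)  # r^2 for each point
    factor = 1 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    return points * factor

# Hypothetical coefficient: a negative k1 pulls points toward the center.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.5, 0.5]])
distorted = radial_distort(pts, k1=-0.2)
print(distorted)
```

The center point does not move, while points farther from the center are shifted more. That is why the same object looks different depending on where it sits in the image.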

9 Finding corners

sudo pip install matplotlib --upgrade
import cv2
nx, ny = 8, 6  # inner corners per row and per column
image = cv2.imread(fname)  # bgr, (960, 1280, 3)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) #(960, 1280)
ret, corners = cv2.findChessboardCorners(gray, (nx, ny)) # bool, shape(48, 1, 2)
if ret:
    cv2.drawChessboardCorners(image, (nx, ny), corners, ret)

10 camera calibration

The basic workflow:
  1. take multiple pictures of the same known pattern, e.g., a chessboard, which serves as the ground truth. Its ordered 3D corner coordinates (the object points) can be easily created by
    objp = np.zeros((6*8,3), np.float32)
    objp[:,:2] = np.mgrid[0:8, 0:6].T.reshape(-1,2)
  2. For each image, find corners by cv2.findChessboardCorners(gray, (nx, ny)). When they are found, append them to imgpoints and append objp to objpoints.
  3. use the objpoints-imgpoints pairs and the image size to extract calibration parameters such as mtx (camera matrix, shape (3,3)) and dist (distortion coefficients, shape (1,5)).
  4. undistort the images and compare them with the originals side by side.
# calibrateCamera expects the image size as (width, height), hence the reversed shape
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, gray.shape[::-1], None, None)
undistort = cv2.undistort(original, mtx, dist, None, mtx)
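
To make the object-point grid from step 1 concrete, here is a small NumPy-only sketch for the 8x6 pattern used in these notes. It only inspects the array layout and does not run a calibration.

```python
import numpy as np

nx, ny = 8, 6  # inner corners per row and per column, as in the notes

# One (x, y, z) row per corner; z stays 0 because the board is flat.
objp = np.zeros((ny * nx, 3), np.float32)
objp[:, :2] = np.mgrid[0:nx, 0:ny].T.reshape(-1, 2)

print(objp.shape)  # (48, 3)
print(objp[:3])    # x varies fastest: (0,0,0), (1,0,0), (2,0,0)
print(objp[nx])    # first corner of the second row: (0,1,0)
```

The row order matches the order in which cv2.findChessboardCorners reports corners (row by row), which is what lets the same objp be reused for every image.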

17 perspective transform

The goal is to get a top-down view of the road, from which the lane curvature can be measured.
import cv2
import numpy as np
import joblib  # sklearn.externals.joblib was removed from newer scikit-learn releases
f = "calibration_wide/wide_dist_pickle.p"
mtx, dist = joblib.load(f)
img = cv2.imread('calibration_wide/GOPR0070.jpg')
nx, ny = 8, 6  # inner corners per row and per column
undist = cv2.undistort(img, mtx, dist, None, mtx)
ret, corners = cv2.findChessboardCorners(undist[:,:,0], (nx, ny), None)
if ret:
    # draw on the undistorted image, since that is where the corners were found
    cv2.drawChessboardCorners(undist, (nx, ny), corners, ret)
    src = np.float32([corners[0], corners[nx-1], corners[-1], corners[-nx]]) # clockwise
    dst = np.float32([[0, 0],[1280, 0], [1280, 960],[0, 960]]) 
    M = cv2.getPerspectiveTransform(src, dst)
    top_down = cv2.warpPerspective(undist, M, (1280,960))
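
To show what the matrix M represents, here is a NumPy-only sketch that solves for a 3x3 perspective matrix from four point pairs and applies it to points. The helper names perspective_matrix and warp_points and the source coordinates are made up for illustration. In practice cv2.getPerspectiveTransform and cv2.warpPerspective do this work.

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve for the 3x3 matrix M (with M[2, 2] fixed to 1) mapping src to dst."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (m11*x + m12*y + m13) / (m31*x + m32*y + 1), same pattern for v
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    m = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(m, 1.0).reshape(3, 3)

def warp_points(M, pts):
    """Apply M in homogeneous coordinates, then divide by the last component."""
    pts = np.asarray(pts, dtype=float)
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts_h @ M.T
    return out[:, :2] / out[:, 2:3]

# Hypothetical trapezoid in the camera image, listed clockwise from top-left
src = np.array([[300, 200], [900, 200], [1200, 700], [100, 700]], dtype=float)
dst = np.array([[0, 0], [1280, 0], [1280, 960], [0, 960]], dtype=float)

M = perspective_matrix(src, dst)
print(np.round(warp_points(M, src)))  # lands on the corners in dst
```

The division by the third component is what makes the transform a perspective one rather than an affine one. It lets a trapezoid become a rectangle.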

20 get gradient by Sobel operator

The Canny edge detection algorithm is great at finding all the edges in an image, and lines can then be picked out with the help of Hough space. To filter out the undesirable edges, we narrow our focus to the gradient along the x-direction, since lane lines are close to vertical in the image. This is where the Sobel operator comes in.
  1. gradient of x: sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize =3)
  2. gradient magnitude: gradmag = np.sqrt(sobelx**2 + sobely**2)
  3. direction of the gradient: np.arctan2(np.absolute(sobely), np.absolute(sobelx)). Direction alone is not particularly useful because there are gradients in every direction everywhere in the image.
  4. combine the thresholded results.
# get the derivatives in the x direction (1, 0) and the y direction (0, 1)
sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
sobely = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
# scale the absolute gradients into (0,255)
abs_sobelx = np.absolute(sobelx)
abs_sobely = np.absolute(sobely)
scaled_sobelx = np.uint8(255*abs_sobelx/np.max(abs_sobelx))
scaled_sobely = np.uint8(255*abs_sobely/np.max(abs_sobely))
# select pixels based on the x gradient strength
gradx = np.zeros_like(scaled_sobelx)
gradx[(scaled_sobelx >= 20) & (scaled_sobelx <= 100)] = 1
# same for the y gradient (reusing the x thresholds here is an assumption)
grady = np.zeros_like(scaled_sobely)
grady[(scaled_sobely >= 20) & (scaled_sobely <= 100)] = 1
# gradient magnitude, thresholded into its own mask so gradmag is not overwritten
gradmag = np.sqrt(sobelx**2 + sobely**2)
mag_binary = np.zeros_like(gradx)
mag_binary[(gradmag >= 0) & (gradmag <= 200)] = 1
# direction of the gradient
absgraddir = np.arctan2(np.absolute(sobely), np.absolute(sobelx))
dir_binary = np.zeros_like(absgraddir)
dir_binary[(absgraddir >= 0) & (absgraddir <= np.pi)] = 1
# combine the selection thresholds
combined = np.zeros_like(dir_binary)
combined[((gradx == 1) & (grady == 1)) | ((mag_binary == 1) & (dir_binary == 1))] = 1
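
To illustrate why the x gradient is the useful one for lane lines, here is a small NumPy-only sketch on a synthetic image with a single vertical edge. It uses np.gradient as a stand-in for cv2.Sobel so that it runs without OpenCV. The helper names and the threshold (20, 255) are made up for illustration.

```python
import numpy as np

def scaled_abs(grad):
    """Scale the absolute gradient into (0, 255), guarding against all zeros."""
    a = np.absolute(grad)
    peak = a.max()
    if peak == 0:
        return np.zeros(a.shape, dtype=np.uint8)
    return np.uint8(255 * a / peak)

def binary_threshold(values, lo, hi):
    """Return 1 where lo <= values <= hi, else 0."""
    out = np.zeros(values.shape, dtype=np.uint8)
    out[(values >= lo) & (values <= hi)] = 1
    return out

# Synthetic image: dark left half, bright right half (one vertical edge)
img = np.zeros((6, 8), dtype=np.float64)
img[:, 4:] = 200.0

gy, gx = np.gradient(img)  # derivatives along rows (y) and columns (x)
sx = binary_threshold(scaled_abs(gx), 20, 255)
sy = binary_threshold(scaled_abs(gy), 20, 255)

print(sx.sum(), sy.sum())  # 12 0
```

The vertical edge shows up only in the x-gradient mask. A horizontal edge would show up only in the y-gradient mask.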

26 color space

RGB issues: the B channel does not detect yellow lane lines, and all channels vary under different levels of brightness.
See the Wikipedia article on the HSL and HSV color spaces.
HSV (Hue, Saturation, Value) and HLS (Hue, Lightness, Saturation) are very similar.
Hue, in the range (0,179) for 8-bit images in OpenCV, represents color independent of any change in brightness.
Lightness and Value represent different ways to measure the relative lightness or darkness of a color. For example, a dark red will have a similar hue but much lower value for lightness than a light red.
Saturation is a measurement of colorfulness: the more intense a color, the higher its saturation value; the closer to white, the lower its saturation value.
The H and S channels stay fairly consistent in shadow or excessive brightness. The S channel seems to detect lane lines pretty well, as does the dark section of the H channel.
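
The dark red versus light red example can be checked with Python's standard colorsys module. This is only an illustration: colorsys works on values in the range [0, 1], unlike OpenCV's 8-bit ranges, and the sample colors are made up.

```python
import colorsys

# RGB values scaled to the range [0, 1]
dark_red = (0.5, 0.0, 0.0)
light_red = (1.0, 0.5, 0.5)
yellow = (1.0, 1.0, 0.0)

h_dark, l_dark, s_dark = colorsys.rgb_to_hls(*dark_red)
h_light, l_light, s_light = colorsys.rgb_to_hls(*light_red)

print(h_dark, l_dark)    # 0.0 0.25
print(h_light, l_light)  # 0.0 0.75
print(yellow[2])         # 0.0, pure yellow has no blue component
```

Both reds share the same hue and differ only in lightness. Pure yellow has a zero blue component, which is consistent with the B channel missing yellow lane lines.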

28 color threshold

Red: (200,255)
S: (90,255)
H: (15,100)
After some experiments, S is better for yellow lines, R is better for white lines.
To create the binary image, combine the x gradient of the lightness channel with the raw value of the saturation channel. Threshold settings: (20,100) for the gradient and (170,255) for saturation.
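
A minimal NumPy sketch of that combination, assuming the two masks are merged with a logical OR (the notes do not say how they are merged). The function name combine_binaries and the tiny input arrays are made up for illustration.

```python
import numpy as np

def combine_binaries(scaled_sobelx, s_channel,
                     sx_thresh=(20, 100), s_thresh=(170, 255)):
    """Mark a pixel if either the x gradient or the saturation is in range."""
    sx_binary = (scaled_sobelx >= sx_thresh[0]) & (scaled_sobelx <= sx_thresh[1])
    s_binary = (s_channel >= s_thresh[0]) & (s_channel <= s_thresh[1])
    return (sx_binary | s_binary).astype(np.uint8)

# Tiny made-up inputs: scaled x gradient of lightness, and raw saturation
scaled_sobelx = np.array([[10, 50], [120, 30]], dtype=np.uint8)
s_channel = np.array([[200, 100], [100, 90]], dtype=np.uint8)

combined = combine_binaries(scaled_sobelx, s_channel)
print(combined)
# [[1 1]
#  [0 1]]
```

The top-left pixel is kept by saturation alone and the right column by the gradient alone, so each channel can cover lines the other one misses.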

33 finding the lines

  1. np.copy() the RGB image and set the low-value pixels to zero, so that only the high-value pixels are preserved
  2. np.sum(img_cut[:,:,0], axis = 0) sums the pixel values in each column, giving a histogram over the x-direction; its 2 peaks estimate the lane line positions
  3. use 9 windows to track each lane line from the bottom of the image to the top. In each loop, define a window by its center and margin, identify the nonzero pixels inside it (these will be used for fitting the parabola), and use np.mean() of their x positions to re-center the next window
  4. use np.concatenate() to merge the points and np.polyfit(y,x,2) to get the coefficients of the function x=f(y)
  5. use np.linspace(min,max,num) to generate y points and use the fitted coefficients to get the parabolic curves.
  6. with the initial fitting parameters at hand, search for pixels in the neighborhood of the fitted curves in subsequent images.
  7. alternatively, use the initial center points, np.convolve, and np.argmax to get the new centers and windows. But in my experiments this approach didn’t collect enough points to fit the parabolic curves.
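
Steps 2 to 4 can be sketched with NumPy alone on a synthetic binary image. This is a simplified illustration rather than the project code: the function names, the image size, and the margin and minpix values are all made up.

```python
import numpy as np

def find_line_bases(binary):
    """Column sums of the bottom half; take one peak per image half."""
    histogram = np.sum(binary[binary.shape[0] // 2:, :], axis=0)
    midpoint = histogram.shape[0] // 2
    left_base = int(np.argmax(histogram[:midpoint]))
    right_base = int(np.argmax(histogram[midpoint:])) + midpoint
    return left_base, right_base

def sliding_window_fit(binary, x_base, nwindows=9, margin=20, minpix=5):
    """Collect pixels window by window (bottom to top) and fit x = f(y)."""
    height = binary.shape[0]
    window_height = height // nwindows
    nonzeroy, nonzerox = binary.nonzero()
    x_current = x_base
    lane_inds = []
    for window in range(nwindows):
        y_low = height - (window + 1) * window_height
        y_high = height - window * window_height
        inside = ((nonzeroy >= y_low) & (nonzeroy < y_high) &
                  (nonzerox >= x_current - margin) &
                  (nonzerox < x_current + margin)).nonzero()[0]
        lane_inds.append(inside)
        if len(inside) > minpix:
            # re-center the next window on the mean x of the pixels found
            x_current = int(np.mean(nonzerox[inside]))
    lane_inds = np.concatenate(lane_inds)
    return np.polyfit(nonzeroy[lane_inds], nonzerox[lane_inds], 2)

# Synthetic 90x200 binary image: a gently curved left line, a straight right line
binary = np.zeros((90, 200), dtype=np.uint8)
for y in range(90):
    x_left = int(round(40 + 0.001 * y * y))
    binary[y, x_left - 1:x_left + 2] = 1
    binary[y, 149:152] = 1

left_base, right_base = find_line_bases(binary)
left_fit = sliding_window_fit(binary, left_base)
right_fit = sliding_window_fit(binary, right_base)
print(left_base, right_base)
print(left_fit, right_fit)
```

Fitting x as a function of y (rather than y of x) matters because lane lines are nearly vertical in the warped image, so one x can correspond to many y values.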

34 sliding window search: convolution

  1. use np.sum and np.convolve to get the new center for each window. This approach is more mathematical and may be more robust.
  2. b,g,r = cv2.split(img) and img = cv2.merge((b,g,r)) split and merge the color channels.
  3. cv2.addWeighted() overlays a mask on an image; it computes a*x1 + b*x2 plus a scalar offset
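
Here is a minimal NumPy sketch of the convolution idea from step 1: slide a flat window over the column sums and take the position with the highest response. The function name window_center and the input values are made up, and the half-window offset used here is one reasonable choice rather than the only one.

```python
import numpy as np

def window_center(column_sums, window_width):
    """Return the center column of the best-matching window."""
    window = np.ones(window_width)
    conv = np.convolve(window, column_sums)  # 'full' mode by default
    # In 'full' mode, conv[k] sums the window_width columns ending at k,
    # so shift back by half a window to get the center column.
    return np.argmax(conv) - (window_width - 1) / 2

# Made-up column sums with a blob of lane pixels in columns 10..14
column_sums = np.zeros(30)
column_sums[10:15] = 3

center = window_center(column_sums, window_width=5)
print(center)  # 12.0
```

Compared with taking np.argmax of the raw column sums, the convolution favors a position where the whole window is full, which makes it less sensitive to a single noisy column.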

35 measure curvature

The radius of curvature (awesome tutorial here).
x=Ay^2+By+C, where A has units of inverse length
R_{curve}= \frac{(1+(2Ay+B)^2)^{3/2}}{|2A|}
def curvature(fit, y):
    A, B = fit[0], fit[1]
    # only the slope term (2Ay+B) is squared, then 1 is added
    return (1 + (2*A*y + B)**2)**1.5 / np.absolute(2*A)
# Define conversions in x and y from pixels space to meters
ym_per_pix = 30/720 # meters per pixel in y dimension
xm_per_pix = 3.7/700 # meters per pixel in x dimension

# Fit new polynomials to x,y in world space
left_fit_cr = np.polyfit(ploty*ym_per_pix, leftx*xm_per_pix, 2)
right_fit_cr = np.polyfit(ploty*ym_per_pix, rightx*xm_per_pix, 2)
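# Evaluate at the bottom of the image (the largest y value, closest to the car)
y_eval = np.max(ploty)
# Calculate the new radii of curvature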
left_curverad = ((1 + (2*left_fit_cr[0]*y_eval*ym_per_pix + left_fit_cr[1])**2)**1.5) / np.absolute(2*left_fit_cr[0])
right_curverad = ((1 + (2*right_fit_cr[0]*y_eval*ym_per_pix + right_fit_cr[1])**2)**1.5) / np.absolute(2*right_fit_cr[0])
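
As a sanity check on the radius formula, here is a NumPy-only sketch that fits a parabola to a short arc of a circle with a known radius and recovers that radius. The 1000-unit radius and the sampling range are made up for illustration.

```python
import numpy as np

def curvature(fit, y):
    """Radius of curvature of x = A*y**2 + B*y + C at the given y."""
    A, B = fit[0], fit[1]
    return (1 + (2 * A * y + B) ** 2) ** 1.5 / np.absolute(2 * A)

R = 1000.0                      # true radius of the test circle
y = np.linspace(-50, 50, 101)   # short arc around y = 0
x = R - np.sqrt(R ** 2 - y ** 2)

fit = np.polyfit(y, x, 2)       # coefficients [A, B, C]
radius = curvature(fit, 0.0)
print(radius)  # close to 1000
```

The recovered value is slightly off because a parabola only approximates a circle, and the approximation gets worse as the arc gets longer.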

36 project tips

  • camera calibration uses a different setting: a 9x6 chessboard. The images are stored in the “camera_cal” folder.
  • expect the radius of curvature to be around 1 km
  • keep track of the line base and curvature from frame to frame.
  • smooth out the results by averaging over recent frames
  • use cv2.warpPerspective(color_warp, Minv, (image.shape[1], image.shape[0])) to generate the filled area that represents the found lane and add it to the original image. Note that the size argument is (width, height).
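
The smoothing tip can be sketched as a small helper that averages the polynomial coefficients over the last few frames. The class name LineSmoother and the window of 5 frames are made up, and a plain mean is only one of several possible smoothing choices.

```python
from collections import deque

import numpy as np

class LineSmoother:
    """Keep the most recent fits and return their element-wise mean."""

    def __init__(self, n_frames=5):
        # deque drops the oldest fit automatically once it is full
        self.recent_fits = deque(maxlen=n_frames)

    def update(self, fit):
        self.recent_fits.append(np.asarray(fit, dtype=float))
        return np.mean(list(self.recent_fits), axis=0)

smoother = LineSmoother(n_frames=2)
first = smoother.update([1.0, 2.0, 3.0])
second = smoother.update([3.0, 4.0, 5.0])
third = smoother.update([5.0, 6.0, 7.0])
print(first, second, third)
```

A longer window gives a steadier lane overlay but reacts more slowly when the road actually bends.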
Produced video is uploaded to
I want to highlight 3 points:
  1. perspective transform needs some ground-truth facts to deal with the depth perception
  2. know when to use the Sobel operator and when to use color spaces. For this project, the Red channel is enough for the “not challenged” video
  3. know how to use a histogram to estimate the lane line base, use a window search to collect qualified points, and use the initial fitting coefficients to quickly localize points in subsequent images.

End of term 1 summary

First, in the 5 projects:
  • 2 projects are about finding lanelines, either a straight line or curved lines. These projects are about computer vision.
  • 2 projects are about identifying objects, either traffic signs or other vehicles. These projects are about deep learning.
  • 1 project is about end-to-end learning. This is still deep learning.
Second, this nanodegree helped me demystify how an autonomous car works. Although some tricky bugs bothered me a lot, I learned a lot.
Thank you, Udacity and Sebastian!