Course 16, Project 4: Advanced Lane Finding
The following are my notes. Each section is labeled by the lesson number.
1 computer vision
Robotics can be broken down into a 3-step cycle:
- sense or perceive the world
- decide what to do based on that perception
- perform an action to carry out that decision
80% of the challenge of building a self-driving car is perception.
Cameras are used because they have better spatial resolution and are much cheaper, although they lack the 3D information that can be captured by lidar.
4 why correct image distortion?
- it changes the apparent size of objects
- it changes the apparent shape of objects
- an object's appearance changes depending on its position in the frame
- it makes objects appear closer or farther than they actually are
Pinhole camera images are free from distortion, but lenses tend to introduce image distortion.
9 Finding corners
```
sudo pip install matplotlib --upgrade
```
```python
import cv2
import matplotlib.pyplot as plt

image = cv2.imread(fname)  # BGR, shape (960, 1280, 3)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # shape (960, 1280)
ret, corners = cv2.findChessboardCorners(gray, (nx, ny))  # bool, shape (48, 1, 2)
if ret == True:
    cv2.drawChessboardCorners(image, (nx, ny), corners, ret)
    plt.imshow(image)
    plt.show()
```
10 camera calibration
Course material: https://github.com/udacity/CarND-Camera-Calibration
The basic workflow:
- take multiple pictures of the same object, e.g., a chessboard, which serves as the ground truth. The ordered 3D object points can be easily created by:

```python
objp = np.zeros((6*8, 3), np.float32)
objp[:, :2] = np.mgrid[0:8, 0:6].T.reshape(-1, 2)
```
- for each image, find corners with `cv2.findChessboardCorners(gray, (nx, ny))` and append them to imgpoints.
- use the objpoints-imgpoints pairs and the grayscale image shape to extract calibration parameters such as mtx (camera matrix, shape (3, 3)) and dist (distortion coefficients, shape (1, 5)).
- undistort each image and compare the results side by side.
```python
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, gray.shape[::-1], None, None)  # image size is (width, height)
undistort = cv2.undistort(original, mtx, dist, None, mtx)
```
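A minimal end-to-end sketch of this workflow; the file pattern `calibration_wide/GOPR*.jpg` is a placeholder for the actual calibration images:

```python
import glob
import cv2
import numpy as np

nx, ny = 8, 6  # inner corners per row and column
objp = np.zeros((nx * ny, 3), np.float32)
objp[:, :2] = np.mgrid[0:nx, 0:ny].T.reshape(-1, 2)

objpoints, imgpoints = [], []  # 3D points in the world, 2D points in the image
for fname in glob.glob('calibration_wide/GOPR*.jpg'):  # placeholder pattern
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    ret, corners = cv2.findChessboardCorners(gray, (nx, ny), None)
    if ret:
        objpoints.append(objp)
        imgpoints.append(corners)

ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, gray.shape[::-1], None, None)
undist = cv2.undistort(img, mtx, dist, None, mtx)
```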
17 perspective transform
The goal is to get the lane curvature.
```python
import cv2
import numpy as np
from sklearn.externals import joblib

f = "calibration_wide/wide_dist_pickle.p"
mtx, dist = joblib.load(f)
img = cv2.imread('calibration_wide/GOPR0070.jpg')
nx, ny = 8, 6
undist = cv2.undistort(img, mtx, dist, None, mtx)
ret, corners = cv2.findChessboardCorners(undist[:, :, 0], (nx, ny), None)
if ret == True:
    cv2.drawChessboardCorners(img, (nx, ny), corners, ret)
    # 4 outer corners, clockwise from top-left
    src = np.float32([corners[0], corners[nx-1], corners[-1], corners[-nx]])
    dst = np.float32([[0, 0], [1280, 0], [1280, 960], [0, 960]])
    M = cv2.getPerspectiveTransform(src, dst)
    top_down = cv2.warpPerspective(undist, M, (1280, 960))
```
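For road images there is no chessboard to supply src points, so four points are hand-picked from a straight-lane frame. A sketch with illustrative (not measured) coordinates, assuming a 1280x720 frame and the `undist` image from above:

```python
import cv2
import numpy as np

h, w = 720, 1280  # assumed frame size
# hand-picked trapezoid on a straight-lane frame (illustrative values only)
src = np.float32([[580, 460], [700, 460], [1040, 680], [260, 680]])
# map it to a rectangle so straight lanes become parallel vertical lines
dst = np.float32([[300, 0], [980, 0], [980, 720], [300, 720]])

M = cv2.getPerspectiveTransform(src, dst)
Minv = cv2.getPerspectiveTransform(dst, src)  # for warping back later
warped = cv2.warpPerspective(undist, M, (w, h))  # dsize is (width, height)
```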
20 get gradients with the Sobel operator
The Canny edge detection algorithm is great for finding all the edges and lines with the help of Hough space. To filter out undesirable edges, we narrow our focus to the gradient along the x-direction. This is where the Sobel operator comes in.
- gradient of x:

```python
sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
```

- gradient magnitude:

```python
gradmag = np.sqrt(sobelx**2 + sobely**2)
```

- direction of the gradient:

```python
np.arctan2(np.absolute(sobely), np.absolute(sobelx))
```

Direction alone is not particularly useful because gradients point in every direction everywhere in the image.
```python
# derivative in the x direction, denoted by (1, 0)
sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
sobely = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)

# normalize into (0, 255)
abs_sobelx = np.absolute(sobelx)
scaled_sobel = np.uint8(255 * abs_sobelx / np.max(abs_sobelx))

# select pixels based on the x gradient strength
sxbinary = np.zeros_like(scaled_sobel)
sxbinary[(scaled_sobel >= 20) & (scaled_sobel <= 100)] = 1

# gradient magnitude, scaled and thresholded into a separate binary image
gradmag = np.sqrt(sobelx**2 + sobely**2)
scaled_mag = np.uint8(255 * gradmag / np.max(gradmag))
mag_binary = np.zeros_like(scaled_mag)
mag_binary[(scaled_mag >= 0) & (scaled_mag <= 200)] = 1

# direction of the gradient
absgraddir = np.arctan2(np.absolute(sobely), np.absolute(sobelx))
dir_binary = np.zeros_like(absgraddir)
dir_binary[(absgraddir >= 0) & (absgraddir <= np.pi)] = 1

# combine the selection thresholds
combined = np.zeros_like(dir_binary)
combined[(sxbinary == 1) | ((mag_binary == 1) & (dir_binary == 1))] = 1
```
26 color space
RGB issues: the B channel does not detect the yellow lane line, and all channels vary under different levels of brightness.
See the Wikipedia article on the HLS and HSV color spaces.
(Hue, Saturation, Value) and (Hue, Lightness, Saturation) are very similar.
Hue, with a range of (0, 179) in OpenCV, represents color independent of any change in brightness.
Lightness and Value represent different ways to measure the relative lightness or darkness of a color. For example, a dark red will have a similar hue but much lower value for lightness than a light red.
Saturation is a measurement of colorfulness: the purer a color, the higher its saturation; the closer to white, the lower its saturation.
The H and S channels stay fairly consistent in shadow or under excessive brightness. The S channel seems to detect lane lines pretty well, as does the dark section of the H channel.
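A quick sketch for eyeballing the three channels side by side (`test_image.jpg` is a placeholder file name):

```python
import cv2
import matplotlib.pyplot as plt

img = cv2.imread('test_image.jpg')          # BGR; placeholder file name
hls = cv2.cvtColor(img, cv2.COLOR_BGR2HLS)  # channel order: H, L, S

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for i, (ax, name) in enumerate(zip(axes, ('H', 'L', 'S'))):
    ax.imshow(hls[:, :, i], cmap='gray')
    ax.set_title(name)
plt.show()
```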
28 color threshold
After some experiments, S is better for yellow lines and R is better for white lines.
- combine the x gradient of the lightness channel with the raw saturation values to create the binary image. Threshold settings: (20, 100) and (170, 255), as in the sketch below.
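A minimal sketch of that combination, assuming `undist` is the undistorted BGR image and using the thresholds above:

```python
import cv2
import numpy as np

hls = cv2.cvtColor(undist, cv2.COLOR_BGR2HLS)
l_channel, s_channel = hls[:, :, 1], hls[:, :, 2]

# x gradient of the lightness channel, scaled to (0, 255)
sobelx = cv2.Sobel(l_channel, cv2.CV_64F, 1, 0)
abs_sobelx = np.absolute(sobelx)
scaled = np.uint8(255 * abs_sobelx / np.max(abs_sobelx))
sx_binary = np.zeros_like(scaled)
sx_binary[(scaled >= 20) & (scaled <= 100)] = 1

# raw saturation threshold
s_binary = np.zeros_like(s_channel)
s_binary[(s_channel >= 170) & (s_channel <= 255)] = 1

# a pixel passes if either test passes
binary = np.zeros_like(sx_binary)
binary[(sx_binary == 1) | (s_binary == 1)] = 1
```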
33 finding the lines
- use `np.copy()` on the RGB image; by setting the low-value pixels to zero, only the high-value pixels are preserved
- use `np.sum(img_cut[:, :, 0], axis=0)` to sum the pixel intensities over y for each x position, so the 2 peaks estimate the lane line base positions
- use 9 windows to narrow down the positions of the lane lines. In each loop, define a window by its center and margin, identify the nonzero pixels (which will be used for fitting the parabolic equation), and use `np.mean()` to get the new center for the current window; then `np.concatenate()` to merge the points, `np.polyfit(y, x, 2)` to get the coefficients of the function x = f(y), and `np.linspace(min, max, num)` to generate y points and evaluate the fitted coefficients to get the parabolic curves (a condensed sketch follows this list)
- with the initial fitting parameters at hand, search for pixels in the neighborhood of the fitted curves in subsequent images
- alternatively, use the initial center points and `np.argmax` to get the new centers and windows. But this approach doesn't collect enough points to fit the parabolic curves.
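A condensed sketch of the histogram + sliding-window search described above, for one lane line (variable and function names are my own, not the course's):

```python
import numpy as np

def find_lane_line(binary_warped, nwindows=9, margin=100, minpix=50):
    h = binary_warped.shape[0]
    # histogram of the bottom half; its peak is the starting x position
    histogram = np.sum(binary_warped[h // 2:, :], axis=0)
    x_current = np.argmax(histogram[:len(histogram) // 2])  # left line only

    nonzeroy, nonzerox = binary_warped.nonzero()
    window_height = h // nwindows
    lane_inds = []
    for window in range(nwindows):
        y_low = h - (window + 1) * window_height
        y_high = h - window * window_height
        good = ((nonzeroy >= y_low) & (nonzeroy < y_high) &
                (nonzerox >= x_current - margin) &
                (nonzerox < x_current + margin)).nonzero()[0]
        lane_inds.append(good)
        if len(good) > minpix:  # re-center the next window on the mean x
            x_current = int(np.mean(nonzerox[good]))

    lane_inds = np.concatenate(lane_inds)
    # fit x = f(y) because the line is near-vertical
    fit = np.polyfit(nonzeroy[lane_inds], nonzerox[lane_inds], 2)
    ploty = np.linspace(0, h - 1, h)
    fitx = fit[0] * ploty**2 + fit[1] * ploty + fit[2]
    return fit, ploty, fitx
```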
34 sliding window search: convolution
- use `np.convolve` to get the new centers for each window. This approach is more mathematical and may be more robust from frame to frame (see the sketch after this list).
- use `b, g, r = cv2.split(img)` and `img = cv2.merge((b, g, r))` to deal with the color channels separately
- use `cv2.addWeighted()` to overlay a mask; this is equal to a*x1 + b*x2
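A minimal sketch of finding one window center by convolution, assuming `binary_warped` from the thresholding step and a window width of 50 pixels (my assumption):

```python
import numpy as np

window_width = 50  # assumed template width in pixels
window = np.ones(window_width)  # flat convolution template

# column-wise sum of the bottom quarter of the binary warped image
image_layer = np.sum(binary_warped[3 * binary_warped.shape[0] // 4:, :], axis=0)
conv_signal = np.convolve(window, image_layer)  # 'full' mode by default

# the peak of the convolution marks the densest column of hot pixels;
# subtract half the window width to undo the 'full'-mode offset
center = np.argmax(conv_signal[:len(image_layer) // 2]) - window_width / 2  # left line
```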
35 measure curvature
The radius of curvature (awesome tutorial here). For a second-order polynomial fit $x = Ay^2 + By + C$, the radius is

$$R = \frac{\left(1 + (2Ay + B)^2\right)^{3/2}}{|2A|}$$

where $A$ has units of inverse length.
```python
def curvature(fit, y):
    A, B = fit[0], fit[1]
    return (1 + (2*A*y + B)**2)**1.5 / np.absolute(2*A)

# Define conversions in x and y from pixel space to meters
ym_per_pix = 30/720   # meters per pixel in y dimension
xm_per_pix = 3.7/700  # meters per pixel in x dimension

# Fit new polynomials to x, y in world space
left_fit_cr = np.polyfit(ploty*ym_per_pix, leftx*xm_per_pix, 2)
right_fit_cr = np.polyfit(ploty*ym_per_pix, rightx*xm_per_pix, 2)

# Calculate the new radii of curvature
left_curverad = (1 + (2*left_fit_cr[0]*y_eval*ym_per_pix + left_fit_cr[1])**2)**1.5 / np.absolute(2*left_fit_cr[0])
right_curverad = (1 + (2*right_fit_cr[0]*y_eval*ym_per_pix + right_fit_cr[1])**2)**1.5 / np.absolute(2*right_fit_cr[0])
```
36 project tips
- camera calibration uses a different setting: a 9x6 chessboard, stored in the “camera_cal” folder
- expect the curvature to be around 1 km
- keep track of line base and curvatures from frame to frame.
- smooth out by averaging over recent frames
- use `cv2.warpPerspective(color_warp, Minv, (image.shape[1], image.shape[0]))` to generate the filled area that represents the found lane lines and add it to the original image; note dsize is (width, height), as in the sketch below
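A sketch of that warp-back step, assuming `Minv` from the perspective-transform step and `left_fitx`, `right_fitx`, `ploty` from the line fits:

```python
import cv2
import numpy as np

# blank canvas in the warped (top-down) space
color_warp = np.zeros_like(image).astype(np.uint8)

# stack the two fitted curves into one closed polygon and fill it
pts_left = np.array([np.transpose(np.vstack([left_fitx, ploty]))])
pts_right = np.array([np.flipud(np.transpose(np.vstack([right_fitx, ploty])))])
pts = np.hstack((pts_left, pts_right))
cv2.fillPoly(color_warp, np.int_([pts]), (0, 255, 0))

# warp back to the original perspective; note dsize is (width, height)
newwarp = cv2.warpPerspective(color_warp, Minv, (image.shape[1], image.shape[0]))
result = cv2.addWeighted(image, 1, newwarp, 0.3, 0)
```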
My implementation: https://github.com/jychstar/NanoDegreeProject
Produced video is uploaded to https://youtu.be/xZK199K9jwk
I want to highlight 3 points:
- perspective transform needs some ground-truth facts to deal with depth perception
- know when to use the Sobel operation and when to use color spaces. For this project, the Red channel is enough for the non-challenge video
- know how to use a histogram to estimate the lane line base, a window search to collect qualified points, and the initial fitting coefficients to quickly localize points in subsequent images.
End of term 1 summary
First, in the 5 projects:
- 2 projects are about finding lane lines, either straight lines or curved lines. These projects are about computer vision.
- 2 projects are about identifying objects, either traffic signs or other vehicles. These projects are about deep learning.
- 1 project is about end-to-end learning. This is still deep learning.
Second, this nanodegree helped me demystify how an autonomous car works. Although some tricky bugs bothered me a lot, I learned a lot.
Thank you, Udacity and Sebastian!