Tuesday, August 30, 2016

Take-home thought

I spent only about 12 hours learning R and finished half of this Udacity course, enough to get a quick feel for what R really is.

For me, R is very similar to Matlab, but more focused and highly specialized for data visualization. The console is very powerful: you can install packages from the console, as you would in a terminal.

R has many built-in functions specialized for statistics, so it is very handy for getting values like median, mean, correlation, and deviation.

RStudio is a very nice IDE. It supports Rmd (R Markdown), which is similar to an IPython notebook.

However, the syntax of R is highly specialized for certain kinds of plotting, and there are some syntax changes in ggplot2. I could pick up these details quickly later if I have to.

basics

constructs

concepts difficult to define and measure:

Memory
Happiness
Guilt
Love

operational definition

depression: http://www.hr.ucdavis.edu/asap/pdf_files/Beck_Depression_Inventory.pdf

anger: number of profanities uttered per min

happiness: ratio of minutes spent smiling to minutes not smiling

what’s EDA

Initial data analysis

Check the assumptions required for model fitting and hypothesis testing, handle missing values, and transform variables where needed.

Exploratory data analysis

Summarize the main characteristics of the data, generate better hypotheses, determine which variables have the most predictive power, and select appropriate statistical tools.

Develop a curious and skeptical mindset.

install R: https://cran.rstudio.com/

install RStudio: https://www.rstudio.com/

Swirl (statistics with interactive R learning). Swirl is a software package for the R statistical programming language. Its purpose is to teach statistics and R commands interactively.

Type the following commands in the Console, pressing Enter or Return after each line:
install.packages("swirl")
library(swirl)
swirl()

package

textcat

ggplot2

learning sources

http://www.statmethods.net/

https://www.r-bloggers.com/

http://stackoverflow.com/tags/r/info

https://google.github.io/styleguide/Rguide.xml

basic command

Ctrl+L: clear the console

students <- c("John", "Kate") # assignment; a vector, 1-based; the type is chr instead of string

numbers <- c(1:10)

numberOfChar = nchar(students)

data(mtcars) # load built-in data mtcars

names(mtcars)

str(mtcars)

dim(mtcars)

getwd()

setwd("/Users/yuchaojiang/Downloads/EDA_Course_Materials/lesson2")

statesInfo <- read.csv("stateData.csv")

stateSubset <- subset(statesInfo, state.region == 1)

stateSubset <- statesInfo[statesInfo$state.region == 1, ] # equivalent method

head(stateSubset,2)

dim(stateSubset)

ggplot2

install.packages('ggplot2', dependencies = TRUE)

Sean Taylor

Good data science comes from good questions, not from fancy techniques, or having the right data. It comes from motivating your research with an idea that you care about, and that you think other people will care about.

Gene Wilder

“Success is a terrible thing and a wonderful thing… Just do what you love.”

John Tukey

An approximate answer to the right problem is worth a good deal more than the exact answer to an approximate problem.

data wrangling

tidyr

dplyr

https://s3.amazonaws.com/udacity-hosted-downloads/ud651/DataWranglingWithR.pdf

pseudo_facebook

setwd("/Users/yuchaojiang/Downloads/EDA_Course_Materials/lesson3")

pf <- read.csv("pseudo_facebook.tsv", sep = '\t')
names(pf)
library(ggplot2)
qplot(x =dob_day, data = pf) +
  scale_x_continuous(breaks=1:31)

ggplot(data = pf, aes(x = dob_day)) + 
  geom_histogram(binwidth = 1) + 
  scale_x_continuous(breaks = 1:31) + 
  facet_wrap(~dob_month)

qplot(x=friend_count,data=pf, xlim=c(0,1000))

qplot(x = friend_count, data = pf, binwidth = 25) + 
  scale_x_continuous(limits = c(0, 1000), breaks = seq(0, 1000, 50))

ggplot(aes(x = friend_count), data = subset(pf, !is.na(gender))) + 
  geom_histogram() + 
  scale_x_continuous(limits = c(0, 1000), breaks = seq(0, 1000, 50)) + 
  facet_wrap(~gender)

table(pf$gender)
by(pf$friend_count,pf$gender,summary)

qplot(x=tenure/365, data = pf, binwidth = .25, 
      color = I("black"),fill = I('#F79420'))+
  scale_x_continuous(breaks=seq(1,7,1),limits=c(0,7))+
  xlab('Number of years using Facebook') + 
  ylab('Number of users in sample')

qplot(x=age, data = pf, binwidth = 1, 
      color = I("black"),fill = I('#F79420'))

summary(pf$age)

# lesson 4
qplot(age, friend_count,data=pf)

ggplot(aes(x=age, y=friend_count),data=pf)+
  geom_jitter(alpha=1/20)+
  xlim(13,90)

install.packages('dplyr')
library(dplyr)
age_groups <- group_by(pf,age)
pf.fc_by_age <- summarise(age_groups,
            friend_count_mean = mean(friend_count),
            friend_count_median = median(friend_count),
            n = n())
head(pf.fc_by_age)

ggplot(aes(x=age, y=friend_count),data=pf)+
  xlim(13,90)+
  geom_point(alpha=0.05,
             position= position_jitter(h=0),
             color= 'orange')+
  coord_trans(y='sqrt')+
  # ggplot2 3.3.0 and later use `fun`; older versions call this argument `fun.y`
  geom_line(stat='summary', fun=mean)+
  geom_line(stat='summary',fun=quantile,fun.args=list(probs= 0.1),linetype=2,color='blue')+
  geom_line(stat='summary',fun=quantile,fun.args=list(probs= 0.5),linetype=2,color='red')+
  geom_line(stat='summary',fun=quantile,fun.args=list(probs= 0.9),linetype=2,color='black')

cor.test(pf$age,pf$friend_count,method="pearson")

with(subset(pf,age<=70),cor.test(age,friend_count,method="pearson"))

with(subset(pf,age<=70),cor.test(age,friend_count,method="spearman"))

Monday, August 22, 2016

Introduction to Java programming

learning source

http://www.tutorialspoint.com/java/index.htm

two types of error

compile-time error: e.g. a syntax error
run-time error: e.g. a logic error
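
To make the distinction concrete, here is a minimal sketch (the class and method names are made up for illustration). The compile-time error is shown only in a comment, because a file containing it would not compile. The run-time error is a logic error: it compiles fine but gives the wrong answer.

```java
class ErrorKinds
{
    // Compile-time error (kept in a comment so this file still compiles):
    //     int count = "three";   // a String cannot be assigned to an int

    // Run-time (logic) error: integer division drops the fraction.
    static double buggyAverage(int a, int b)
    {
        return (a + b) / 2;      // (1 + 2) / 2 is 1 in integer arithmetic
    }

    // Fixed version: dividing by 2.0 forces floating-point division.
    static double average(int a, int b)
    {
        return (a + b) / 2.0;
    }

    public static void main(String[] args)
    {
        System.out.println(buggyAverage(1, 2)); // prints 1.0
        System.out.println(average(1, 2));      // prints 1.5
    }
}
```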

algorithms

unambiguous
executable
terminating

methods

accessor (doesn't change the object's state)
mutator (changes the object's state)

object

You ask an object to do work; you don't need to know how it does it.
what, not how

comment //

documentation and api

http://docs.oracle.com/javase/7/docs/api/java/lang/String.html

stringObject.replace(char target, char replacement)

import java.awt.Graphics2D;
import java.awt.geom.Rectangle2D;

code mold

public class XX
{
  public static void main(String args[])
  {
    System.out.println();
  }
}

public interface

public void addFriend(Person friend)
public void unFriend(Person nonfriend)
public String getFriend()

Variables

Instance variables (non-static variables)

Class variables (static variables)

Local variables

example

/**
   A simulated car that consumes gas as it drives.
*/
public class Car
{
    private double milesDriven;
    private double gasInTank;
    private double milesPerGallon;


    /**
       Constructs a car with a given fuel efficiency.
    */
    public Car(double mpg)
    {
        milesDriven = 0;
        gasInTank = 0;
        milesPerGallon = mpg;
    }


    /**
      Adds gas to the tank of this car.
      @param amount the amount of gas added to the tank
    */
    public void addGas(double amount)
    {
        gasInTank = gasInTank + amount;
    }

    /**
      Gets the current amount of gas in the tank of this car.
      @return the current gas level
    */
    public double getGasInTank()
    {
        return gasInTank;
    }

    /**
      Drives this car by a given distance.
      @param distance the distance to drive
    */
    public void drive(double distance)
    {
        this.milesDriven = this.milesDriven + distance;
        double gasConsumed = distance / this.milesPerGallon;
        this.gasInTank = this.gasInTank - gasConsumed;
    }  

    /**
      Gets the current mileage of this car.
      @return the total number of miles driven
    */
    public double getMilesDriven()
    {
        return milesDriven;
    }
}
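
A usage sketch for the class above. So that the snippet runs on its own, it repeats a trimmed copy of Car, nested inside a hypothetical CarDemo class. The numbers (25 mpg, 20 gallons, 100 miles) are arbitrary.

```java
class CarDemo
{
    // Trimmed copy of the Car class from the notes, so this sketch compiles by itself.
    static class Car
    {
        private double milesDriven = 0;
        private double gasInTank = 0;
        private final double milesPerGallon;

        Car(double mpg) { milesPerGallon = mpg; }

        // Mutators: they change the state of the car.
        void addGas(double amount) { gasInTank = gasInTank + amount; }

        void drive(double distance)
        {
            milesDriven = milesDriven + distance;
            gasInTank = gasInTank - distance / milesPerGallon;
        }

        // Accessors: they only report the state.
        double getGasInTank() { return gasInTank; }
        double getMilesDriven() { return milesDriven; }
    }

    public static void main(String[] args)
    {
        Car car = new Car(25);      // 25 miles per gallon
        car.addGas(20);             // 20 gallons in the tank
        car.drive(100);             // uses 100 / 25 = 4 gallons
        System.out.println(car.getGasInTank());    // prints 16.0
        System.out.println(car.getMilesDriven());  // prints 100.0
    }
}
```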

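// Skeleton only: Person is assumed to be defined elsewhere,
// and the method bodies are still to be written.
public class Friend
{
  private String name;
  private String friends;

  public void addFriend(Person friend) { /* to be implemented */ }
  public void unFriend(Person nonfriend) { /* to be implemented */ }
  public String getFriend() { return friends; }
}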

fundamental data type

overflow

doubles are fuzzy

cast: (int)(3.35)
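
A small sketch of the three pitfalls listed above: overflow, fuzzy doubles, and casting. The class and method names are invented for illustration.

```java
class NumberPitfalls
{
    // Adding 1 to the largest int wraps around to the smallest int.
    static int overflow()
    {
        return Integer.MAX_VALUE + 1;
    }

    // 0.1 and 0.2 have no exact binary representation, so the sum is not exactly 0.3.
    static boolean sumIsExactlyPointThree()
    {
        return 0.1 + 0.2 == 0.3;
    }

    // A cast to int drops the fractional part (truncation, not rounding).
    static int truncate(double x)
    {
        return (int) x;
    }

    public static void main(String[] args)
    {
        System.out.println(overflow());                // prints -2147483648
        System.out.println(sumIsExactlyPointThree());  // prints false
        System.out.println(truncate(3.35));            // prints 3
        System.out.println(Math.round(3.65));          // prints 4 (rounds to the nearest long)
    }
}
```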

grayscale: Y = 0.2126*R + 0.7152*G + 0.0722*B
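
The grayscale weights (0.2126 for red, 0.7152 for green, 0.0722 for blue) can be sketched as a method. The class name, the method name, and the choice to round to the nearest integer are my assumptions, not part of the course material.

```java
class Grayscale
{
    // Weighted sum of the three channels; the weights add up to 1.0,
    // so a gray input (equal channels) keeps its value.
    static int luminance(int red, int green, int blue)
    {
        double y = 0.2126 * red + 0.7152 * green + 0.0722 * blue;
        return (int) Math.round(y);
    }

    public static void main(String[] args)
    {
        System.out.println(luminance(255, 0, 0));      // pure red
        System.out.println(luminance(0, 255, 0));      // pure green
        System.out.println(luminance(255, 255, 255));  // white
    }
}
```

Green contributes the most to the result and blue the least, which is why a pure green pixel comes out much brighter than a pure blue one.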

sunset effect: +25

final int MAX_RED=255;

System.out.printf("%8.2f\n", price); // field width of 8 characters, 2 digits after the decimal point, floating-point value
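
To see what the width and precision do, here is a sketch that uses String.format with the same "%8.2f" pattern, so the padding is visible between brackets. The class name is made up, and Locale.US is passed only so the decimal separator is always a dot.

```java
import java.util.Locale;

class FormatDemo
{
    // Same pattern as in the note, but returning the text instead of printing it.
    static String formatPrice(double price)
    {
        return String.format(Locale.US, "%8.2f", price);
    }

    public static void main(String[] args)
    {
        System.out.println("[" + formatPrice(3.14159) + "]");  // prints [    3.14]
        System.out.println("[" + formatPrice(1234.5) + "]");   // prints [ 1234.50]
    }
}
```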

https://www.udacity.com/wiki/cs046/factsheets?_ga=1.92994461.1633053593.1468204228

import java.util.Scanner;

Scanner in = new Scanner(System.in);

int age = in.nextInt();
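
A Scanner does not have to read from the keyboard. In this sketch (hypothetical class and method names) a String stands in for System.in, so the example runs without any typing.

```java
import java.util.Scanner;

class ScannerDemo
{
    // Reads the first integer from any Scanner, whatever its source.
    static int readAge(Scanner in)
    {
        return in.nextInt();
    }

    public static void main(String[] args)
    {
        Scanner fromText = new Scanner("42 years");   // a String stands in for the keyboard
        System.out.println(readAge(fromText));         // prints 42
    }
}
```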

java.lang.Math

Math.pow(a, n);

Math.sqrt(100);

Math.max(a,b);

decision

if ()
  {}
else if()
  {}
else
  {}

public static final int SECONDS_PER_MINUTE = 60; // constant shared by the whole class

final int SECONDS_PER_MINUTE = 60; // constant local to a method

they are never exactly the same

str1.equals(str2); // not ==

final double EPSILON=1e-12;

Math.abs(x-y)<EPSILON; // not ==
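
A minimal sketch of both comparisons, using the EPSILON value from the note. The class name and the helper nearlyEqual are my own.

```java
class CompareDemo
{
    static final double EPSILON = 1e-12;

    // Two doubles count as equal when they differ by less than EPSILON.
    static boolean nearlyEqual(double x, double y)
    {
        return Math.abs(x - y) < EPSILON;
    }

    public static void main(String[] args)
    {
        double root = Math.sqrt(2);
        System.out.println(root * root == 2);             // prints false
        System.out.println(nearlyEqual(root * root, 2));  // prints true

        String a = "hello";
        String b = new String("hello");  // same characters, different object
        System.out.println(a == b);      // prints false: == compares references
        System.out.println(a.equals(b)); // prints true: equals compares contents
    }
}
```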

loop

for (int i=1;i<=6;i++){} // i is local variable

for (int value:values){} // values is an array
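
Both loop forms filled in with a body, as a sketch. The class and method names are hypothetical.

```java
class LoopDemo
{
    // Counting loop: i exists only inside the loop.
    static int sumOneTo(int n)
    {
        int sum = 0;
        for (int i = 1; i <= n; i++)
        {
            sum = sum + i;
        }
        return sum;
    }

    // Enhanced for loop: visits every element of the array in order.
    static int sum(int[] values)
    {
        int total = 0;
        for (int value : values)
        {
            total = total + value;
        }
        return total;
    }

    public static void main(String[] args)
    {
        System.out.println(sumOneTo(6));                 // prints 21
        System.out.println(sum(new int[] { 3, 5, 7 }));  // prints 15
    }
}
```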

Debugger

break point
single step
inspect variables

ArrayList vs Array

ArrayList<Double> values = new ArrayList<Double>();

methods: get(), set(), add()

double[] values=new double[10];

double[] values={32,54,67.5,29,35};
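
The main difference, as I understand it, is that an ArrayList grows as elements are added, while an array keeps the length it was created with. A sketch with made-up class and method names:

```java
import java.util.ArrayList;

class ListVsArray
{
    // An ArrayList starts empty and grows with every add().
    static ArrayList<Double> growList()
    {
        ArrayList<Double> values = new ArrayList<Double>();
        values.add(32.0);
        values.add(54.0);
        values.set(1, 67.5);    // replace the element at index 1
        return values;
    }

    // An array has a fixed length; unset double elements are 0.0.
    static double[] fixedArray()
    {
        double[] values = new double[3];
        values[0] = 32;
        return values;
    }

    public static void main(String[] args)
    {
        ArrayList<Double> list = growList();
        System.out.println(list.size() + " " + list.get(1));  // prints 2 67.5

        double[] array = fixedArray();
        System.out.println(array.length + " " + array[2]);    // prints 3 0.0
    }
}
```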

create a package

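Put a package statement on the first line of the source file, then compile in one of two ways:

javac -d . file_name.java (the package folders are created under the current directory)
javac -d Destination_folder file_name.java (the package folders are created under Destination_folder)

To use the classes in a package, import them by package name (not by destination folder):

import package_name.*;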

import java.util.*;

Scanner, ArrayList, Arrays, Random

Interface

public interface Drawable
{
  void draw();  // automatically public, no implementation
}
public class House implements Drawable
{
  public void draw() { /* drawing code goes here */ }
}
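
A self-contained sketch of why an interface is useful: code written against Drawable works with any class that implements it. The Tree class, the drawAll helper, and the printed messages are invented for this example. Everything is nested in one class so it fits in a single file.

```java
class InterfaceDemo
{
    interface Drawable
    {
        void draw();   // implicitly public and abstract
    }

    static class House implements Drawable
    {
        public void draw() { System.out.println("drawing a house"); }
    }

    static class Tree implements Drawable
    {
        public void draw() { System.out.println("drawing a tree"); }
    }

    // Works with any Drawable, without knowing the concrete class.
    static void drawAll(Drawable[] shapes)
    {
        for (Drawable shape : shapes)
        {
            shape.draw();
        }
    }

    public static void main(String[] args)
    {
        drawAll(new Drawable[] { new House(), new Tree() });
    }
}
```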

Sunday, August 14, 2016

Apple introduced a language called Swift in 2014, so it is pretty new and borrows many good features from other languages. Its grammar feels very similar to JavaScript. A nice part is that Swift has a "struct" that groups related data, and its IDE, Xcode, provides immediate feedback.

I am not interested in developing a game right now, but it's good to know.

learning sources:

https://developer.apple.com/support/development/

https://developer.apple.com/library/ios/referencelibrary/GettingStarted/DevelopiOSAppsSwift/

support both inferred typing and explicit typing

Once the type is declared or set, it can’t change.

var mylove = "w" // inferred typing, Swift will guess it as String

var mylove: Character = "w" // explicit typing

var islove: Bool = true

var secretNumber = 7

var price: Double = 2.50

let pi = 3.14 //constant

Naming

keyword:

var, let, class, import, private, operator.

https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/LexicalStructure.html#//apple_ref/doc/uid/TP40014097-CH30-ID413

Hungarian Notation:

intNumberOfLives
countNumberOfLives
sumTotalScore

Camel Casing:

totalCumulativeScore
secondsSinceLastUpdate
minutesTillLaunch

Naming constant:

PointsPerLife
DefaultGreeting
MaxLength
POINTS_PER_LIFE
DEFAULT_GREETING
MAX_LENGTH

The name of a struct starts with a capital letter:

struct Student {
    let name: String
    var age: Int
    var school: String
}

var ayush = Student(name: "Ayush Saraswat", age: 19, school: "Udacity")

Saturday, August 13, 2016

IPND, stage 5, front-end

Update on 2017-3-6

Seven months ago I got a quick feel for JavaScript. I didn't proceed because I was more interested in the content than the face, and the many tricky details and fine-tuning were somewhat overwhelming to me.

Now I have the content, the knowledge discovered by applying machine learning models to lots of data. I realize the last mile is the data presentation, the packaging, the user experience. It's worth spending time polishing high-quality content, which helps spread the insight to more viewers.

JavaScript

Some basic syntax.

console.log("Hello World!");
var a = "yourname";
var b = a.replace("your", "my"); // "myname"
var y = function(x) { return x; }; // function expression
array.length;
array.slice(0, 2); // copy of the first two elements
a.toUpperCase();
array.pop();
array.push(value);
a.split(" ");
array.join(" ");
// no class needed, use an object literal {}
var bio = {"name": "James",
"age": 32};
bio.city = "Norman";  // dot notation
bio["city"] = "Norman"; // square bracket notation is more tolerant: the key may contain special characters or spaces
if (condition) {}
else {}

jQuery

jQuery is a popular JavaScript library for reading and making changes to the Document Object Model (DOM). The DOM is a tree that contains information about what is actually visible on a website.

While HTML is a static document, the browser converts HTML to the DOM and the DOM can change. In fact, JavaScript’s power comes from its ability to manipulate the DOM, which is essentially a JavaScript object. When JavaScript makes something interesting happen on a website, it’s likely the action happened because JavaScript changed the DOM. jQuery is fast and easy to use, but it doesn’t do anything you can’t accomplish with vanilla (regular) JavaScript.

jQuery was first released in 2006. The latest version is 3.1.1, released on 2016.9.22. It occupies 96% of the JavaScript library market share and is deployed in 70 M websites. The runner-up, “Bootstrap”, has 7 M websites. To my surprise, D3 is only deployed in 12 K websites, probably because it is new.

jQuery’s syntax is designed to make it easier to navigate a document, select DOM elements, create animations, handle events, and develop Ajax applications.

the basics

http://learn.jquery.com/

<!doctype html>
<html>
<head>
    <meta charset="utf-8">
    <title>Demo</title>
</head>
<body>
    <a href="http://jquery.com/">jQuery</a>
    <script src="jquery.js"></script>
    <script>
    // Your code goes here.
    </script>
</body>
</html>

To run code as soon as the document is ready to be manipulated,

$( document ).ready(function() {});

$ is simply an alias for jQuery that is shorter and faster to write. It is essentially a property of the window object.

If another JavaScript library wants to use the $ name, you can define a different alias, $j, for jQuery: var $j = jQuery.noConflict();

Alternatively, you can use a locally-scoped $

jQuery.noConflict();
jQuery( document ).ready(function( $ ) {
    // locally-scoped $ as an alias to jQuery.
    $( "div" ).hide();
});

// The $ variable now has the prototype meaning, which is a shortcut for
// document.getElementById(). mainDiv below is a DOM element, not a jQuery object.
window.onload = function(){
    var mainDiv = $( "main" );
}

The .attr() method is a setter when given two arguments (or an object of attributes) and a getter when given one:

$( "a" ).attr( "href", "allMyHrefsAreTheSameNow.html" );
$( "a" ).attr({
    title: "all titles are the same too!",
    href: "somethingNew.html"
});
$( "a" ).attr( "href" );

.html() – Get or set the HTML contents.
.text() – Get or set the text contents; HTML will be stripped.
.attr() – Get or set the value of the provided attribute.
.width() – Get or set the width in pixels of the first element in the selection as an integer.
.height() – Get or set the height in pixels of the first element in the selection as an integer.
.position() – Get an object with position information for the first element in the selection, relative to its first positioned ancestor. This is a getter only.
.val() – Get or set the value of form elements.

jQuery Object

Working directly with DOM elements is awkward. Wrapping an element in a jQuery object makes life much easier. The following pairs show the same task done in raw JavaScript and in jQuery.

// raw JavaScript
var target = document.getElementById( "target" );
target.innerHTML = "<td>Hello <b>World</b>!</td>";
// jQuery equivalent
$( target ).html( "<td>Hello <b>World</b>!</td>" );

// raw JavaScript
var target = document.getElementById( "target" );
var newElement = document.createElement( "div" );
target.parentNode.insertBefore( newElement, target.nextSibling );
// jQuery equivalent
$( target ).after( newElement );

jQuery UI

This is really cool. With a single line of code, you have a pull-down calendar. A detailed introduction deserves another post.

JSON

JavaScript Object Notation. JSON is very handy for storing hierarchical information. Its high flexibility also makes it vulnerable to bugs. Use http://jsonlint.com/ to validate it and locate errors.

{
  "Schools":[
    {
      "name":"Beijing University of Posts and Communications",
      "city":"Beijing, China",
      "degree":"BS",
      "major":["Applied Physics"]
    },
    {
      "name":"University of Oklahoma",
      "city":"Norman, OK, US",
      "degree":"PhD",
      "major":["Electrical and Computer Engineering"]
    }
  ]
}

Project: online resume

In the head block, load style.css and, optionally, the script for the Google Maps API (you need to obtain your own Google Maps API key).

In the body block, the “main” block has sub-blocks with these ids: header, workExperience, projects, education, mapDiv, lets-connect. Then come three script files: jQuery.js, helper.js, resumeBuilder.js. Last comes the inline script: if a certain element is empty, set .style.display = "none";. Alternatively, we can also set .style.backgroundColor = "black";

To guard against script injection, you can use regular expressions to find every < and > and replace them with HTML entities:

var charEscape = function(_html) {
    var newHTML = _html;
    newHTML = _html.replace(/</g, "&lt;");
    newHTML = newHTML.replace(/>/g, "&gt;");
    return newHTML;
};

Frustrated by tons of errors, I installed a local JavaScript IDE: WebStorm

script path: /usr/local/bin/webstorm

The Google Maps API is cool! The “initializeMap” function is 100 lines of code, really a monster, with many embedded functions like locationFinder, createMapMarker, callback, pinPoster.

Tuesday, August 30, 2016

Data Analysis with R

Monday, August 22, 2016

Introduction to Java programming

Sunday, August 14, 2016

IPND, stage5, iOS

Saturday, August 13, 2016

IPND, stage 5, front-end
