Wednesday, March 1, 2017

Data Analyst ND 3, data visualization by D3

Visualization Fundamentals

Jonathan: Data visualization is about conveying a story or an idea as efficiently as possible. A picture is worth a thousand words.
Ryan: I think The Visual Display of Quantitative Information by Edward Tufte gets at the core: how best to represent underlying data visually, using color, size, and shape, to convey information or insight to an audience. It goes a little further than that, incorporating storytelling and narrative elements so that authors can share an interesting insight they discovered on their own.
What makes a good visualization depends on its purpose:
  • exploratory: tries to get a sense of what the data is and what it can tell you; you turn over 100 rocks to find 1 or 2 interesting nuggets.
  • explanatory: once you have found the nuggets, connect things in interesting ways and look at the data from different angles.
    1. Build a robust understanding of the context. Who is your audience, and what do they need to know or do?
    2. Choose an appropriate type of visual. What is the most straightforward way for the audience to read the data?
    3. Eliminate clutter. Cutting uninformative elements decreases cognitive load, so the important data stands out more.
    4. Draw attention to where you want it. Use color, size, and placement on the page.
Your greatest insight is only as good as your ability to communicate it. So don’t spend too much time on complicated models.
How much do you identify yourself as:
  • designer
  • engineer (computer science)
  • storyteller (communicator)
There is no one place to get all the skills.
A pipeline of (acquire, parse), (filter, mine), (represent, refine), (interact) maps onto computer science, statistical learning, graphic design, and infovis/HCI, respectively.
Basically there are 3 steps: data wrangling, data mining, and data visualization.
The retina sends information to the brain at about 10 Mbit/second.
Encoded: location, color, shape, size
Examples:

visualization spectrum

productivity   visual technologies          metaphor
               Raw, Chartio, Tableau        Excel
high           NVD3, Dimple.js, Rickshaw    Python, Ruby
medium         D3.js                        C, C++
low            WebGL, HTML5 Canvas, SVG     assembly
Strike a balance between abstraction and flexibility.
D3 stands for Data-Driven Documents. It is built on CSS, HTML, JavaScript, and SVG.
The DOM (Document Object Model) is created during page load and is accessed through a JavaScript API. It is both a specification and a hierarchical object.
SVG (Scalable Vector Graphics) is a vector graphics format, so its images scale to any size.
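To make the idea of SVG as markup concrete, here is a minimal sketch that builds the text of a small SVG image containing one circle. The function name circleSvg and its parameters are made up for illustration; the numbers match the circle that is appended with D3 later in these notes.

```javascript
// Hypothetical helper (not from the course): returns SVG markup as a string.
function circleSvg(width, height, cx, cy, r, fill) {
  return '<svg xmlns="http://www.w3.org/2000/svg" width="' + width +
    '" height="' + height + '">' +
    '<circle cx="' + cx + '" cy="' + cy + '" r="' + r +
    '" fill="' + fill + '"/>' +
    '</svg>';
}

// A 600 x 300 canvas with a red circle of radius 50.
console.log(circleSvg(600, 300, 398, 43, 50, 'red'));
```

Saving the printed text as a .svg file and opening it in a browser should show the circle. D3 produces the same kind of elements, but creates them as DOM nodes instead of strings.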

D3 building blocks

D3 was created by Mike Bostock and first released on February 18, 2011. Version 4.0 was released in June 2016. The major change is that the namespace is flat rather than nested; for example, d3.scale.linear() became d3.scaleLinear().

environment setup: loading D3 specification

note: In the Udacity video, they used v3 as shown in their html source file: <script src="./lib/d3.v3.js.min"></script>
D3 is a client-side JavaScript library.
  1. Copy the D3.js source code into the browser console.
  2. Alternatively, write this in console:
var script = document.createElement('script');
script.type ='text/javascript';
script.src = "https://d3js.org/d3.v4.min.js";
document.head.appendChild(script);
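The console snippet above can be wrapped in a small helper that also reports when the library has finished loading. This is only a sketch: loadScript is a hypothetical name, and the document is passed in as a parameter so the function does not depend on a global.

```javascript
// Hypothetical helper (not part of D3): inject a <script> tag and run a
// callback once the browser has loaded the file.
function loadScript(doc, src, onLoad) {
  var script = doc.createElement('script');
  script.type = 'text/javascript';
  script.src = src;
  script.onload = onLoad; // called after the script has been fetched and executed
  doc.head.appendChild(script);
  return script;
}

// In a browser console (assumes the global `document`):
// loadScript(document, 'https://d3js.org/d3.v4.min.js', function () {
//   console.log(d3.version);
// });
```

Waiting for the callback avoids calling d3 before it is defined. Note that the snippets later in these notes use the v3 API (d3.scale.linear, d3.geo.mercator), so loading https://d3js.org/d3.v3.min.js instead is the easier way to follow along.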

use D3 module to control DOM

document.getElementById('header-logo'); // returns a DOM node
document.querySelector('.main'); // returns a DOM node
var elem = d3.select('.main'); // returns a D3 selection wrapping the node
elem.style('background-color', '#757c81');
d3.selectAll('.navbar');
d3.select('.main-title').text('china');
d3.select('.navbar-brand.logo');
var parent_el = d3.select('#header-logo'); // select by id using '#'
parent_el.select('img').attr('alt', 'Udacity');
d3.select('#header-logo').select('img').attr('src', './assets/udacity_white.png'); // change the logo
d3.select('.main').html(''); // empty the body (everything inside .main)
var svg = d3.select('.main').append('svg'); // add an svg element
svg.attr('width', 600).attr('height', 300); // set its size
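// Scales map data values (domain) to pixel values (range).
// This is the v3 API; in v4 the calls are d3.scaleLinear(), d3.scaleLog(), d3.scaleSqrt().
var y = d3.scale.linear().domain([15, 90]).range([250, 0]);
var x = d3.scale.log().domain([250, 100000]).range([0, 600]);
var r = d3.scale.sqrt().domain([52070, 138000000]).range([10, 50]);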
svg.append('circle').attr('fill','red').attr('r',50).attr('cx',398).attr('cy',43);
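To see what the three scales above compute, here is a hand-written sketch in plain JavaScript. It is an illustration of the idea, not D3's implementation: makeScale and identity are hypothetical names, and features such as clamping or ticks are left out.

```javascript
// Illustrative sketch only: map a value from the domain to the range after
// applying a transform (identity, log, or square root).
function makeScale(transform, domain, range) {
  var d0 = transform(domain[0]);
  var d1 = transform(domain[1]);
  return function (value) {
    var t = (transform(value) - d0) / (d1 - d0); // 0 at domain start, 1 at domain end
    return range[0] + t * (range[1] - range[0]); // same fraction of the range
  };
}

function identity(v) { return v; }

// Same domains and ranges as the three scales above.
var y = makeScale(identity, [15, 90], [250, 0]);
var x = makeScale(Math.log, [250, 100000], [0, 600]);
var r = makeScale(Math.sqrt, [52070, 138000000], [10, 50]);

console.log(y(15));   // 250 (bottom of the chart, since SVG y grows downward)
console.log(y(90));   // 0 (top of the chart)
console.log(y(52.5)); // 125 (halfway)
console.log(x(5000)); // close to 300: 5000 is the geometric midpoint of 250 and 100000
```

The reversed range [250, 0] on y is what flips the axis so that larger values are drawn higher. The square-root scale for r makes the circle's area, rather than its radius, grow in proportion to the value.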
Some methods, such as attr() and style(), are both getters and setters: pass 1 parameter to read a value, pass 2 parameters to set it.
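The getter/setter convention can be sketched in plain JavaScript. The object below is a hypothetical stand-in for a selection (makeElement is not a D3 function); it only shows how one method can play both roles and why the setter form can be chained.

```javascript
// Hypothetical stand-in for a selection, to illustrate the
// one-argument getter / two-argument setter convention.
function makeElement() {
  var attributes = {};
  return {
    attr: function (name, value) {
      if (arguments.length < 2) {
        return attributes[name]; // getter: one argument, returns the stored value
      }
      attributes[name] = value;  // setter: two arguments, stores the value
      return this;               // returning the object allows chaining
    }
  };
}

var circle = makeElement();
circle.attr('r', 50).attr('fill', 'red'); // setters chain
console.log(circle.attr('r'));            // 50
console.log(circle.attr('fill'));         // red
```

Because the setter returns the object itself, calls like svg.attr('width', 600).attr('height', 300) can be written on one line, while the getter form ends the chain by returning a plain value.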

let’s make a bar chart

some free tools

http://app.rawgraphs.io/ Drag data and see magic!

be kind to the color-blind

Color blindness affects roughly 10% of men and 1% of women.
Edward Tufte: “Indeed, so difficult and subtle that avoiding catastrophe becomes the first principle in bringing color to information: Above all, do no harm.”

something interesting about the inventor Mike Bostock

He held an “ask me anything” (AMA) session on Reddit, which contains a lot of interesting first-hand material.
What was the defining moment you realized you had to create D3?
The defining moment was when I got the data-join working for the first time. It was magic. I wasn’t even sure I understood how it worked, but it was a blast to use. I realized there could be a practical tool for visualization that didn’t needlessly restrict the types of visualizations you could make.
That was just a brief moment, though. The longer effort started with Protovis, which was a response to the limitations of chart typologies. I wanted something that gave the designer greater control over the output—the kind of control that the early practitioners like Minard, Playfair and Bertin had because they did things by hand. Even within Protovis I felt like I was limited by its mark types; I wanted something that could use all of the DOM and SVG.
What motivated your career choices outside of the academy?
I wish I could pick and choose the best parts of academia and industry, but I’m not sure it’s possible. The primary advantage of academia in my view is that you can afford a long-term perspective (assuming you have funding taken care of): you care about advancing human understanding and not “capturing value.” Yet the danger of academia is that it can easily become too abstract. There are many important, solvable real-world problems that are uninteresting in the academic sense. Ideally you find a way to be productive in the short term whilst moving towards true innovation in the long term.
You mentioned your first steps in programming, but where did you pick up design? Was it a deliberate thing, did someone force you to take courses, did it happen accidentally …?
I studied Human-Computer Interaction as an undergrad, and Don Norman’s book The Design of Everyday Things greatly resonated with me. Once you start thinking about design it becomes impossible to stop, and often greatly frustrating to see so many examples of bad design out in the world.
Tufte’s books were also a huge influence for me. I suspect that the undergraduate (and later graduate) courses were probably the strongest force pushing me to think critically about design, so finding a course you can audit would probably be the best—a little secret about academia is that professors often don’t mind you sitting in on lectures, provided you ask first. A reading list from an introductory HCI course would also be a good place to start.
What’s your daily routine?
Ha. My routine is totally off at the moment because we have a newborn. I get up, feed my daughter, bike her to school, and then come home to help my wife look after the baby, run errands and clean up around the house. (And sometimes, play Hearthstone.) Hopefully… sometime soon I’ll be able to find some quiet space, because as ecstatic as I am about our new family member I still hope to be able to work again.
Before we had children, I would often get excited about ideas and experiments and tinker on them late into the evening and on weekends. I find it to be the easiest thing in the world to work on something if you are passionate about it, and you can break it up into small pieces (like examples) that you can publish and share with others for external validation. So probably, choosing to work on things you are excited about, and then finding space to avoid distractions or interruptions is the key.
“Why? It’s hard to go beyond incremental maintenance of open-source projects while publishing on deadline. Long thoughts take time.” Mike Bostock is the man behind the widely used D3.js

interactive and animation

draw world map!
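// v3 API; in v4 these are d3.geoMercator() and d3.geoPath().
var projection = d3.geo.mercator();
var path = d3.geo.path().projection(projection);
// Bind one GeoJSON feature per country and draw each as an SVG path.
var map = svg.selectAll('path').data(geo_data.features).enter()
.append('path').attr('d', path)
.style('fill', 'rgb(9,157,217)').style('stroke', 'black').style('stroke-width', 0.5);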
...
d3.json("world_countries.json", draw);
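The Mercator projection used above turns longitude and latitude into flat x and y coordinates. Here is a minimal sketch of the underlying spherical Mercator formula in plain JavaScript. mercatorRaw is a hypothetical name for this note, and the sketch works on a unit sphere; D3 additionally scales, translates, and flips the result into pixel coordinates.

```javascript
// Sketch of the spherical Mercator formula on a unit sphere.
// Input is in degrees; output is [x, y] in radians-based units.
function mercatorRaw(lonDeg, latDeg) {
  var lambda = lonDeg * Math.PI / 180; // longitude in radians
  var phi = latDeg * Math.PI / 180;    // latitude in radians
  var x = lambda;                      // x grows linearly with longitude
  var y = Math.log(Math.tan(Math.PI / 4 + phi / 2)); // y stretches toward the poles
  return [x, y];
}

console.log(mercatorRaw(0, 0));    // both coordinates are (very nearly) zero
console.log(mercatorRaw(0, 30)[1]);
console.log(mercatorRaw(0, 60)[1]); // more than twice the value at 30 degrees
```

The stretching in y is why countries near the poles look much larger on a Mercator map than they really are, which is worth keeping in mind when a map is used to compare areas.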
Alternatively, R and Python have their own world-map tools, for example basemap for Python: https://pypi.python.org/pypi/basemap/1.0.7
The most important thing is to decide what you want your audience to know and how you are going to show it. As a data scientist, it’s your job to tell people what they need to know.
