Monday, June 26, 2017

T-SQL(SQL Server 2008) and SSIS

T-SQL is a dialect Microsoft SQL Server shares with Sybase SQL Server due to its legacy. The query declaratively specifies what is to be retrieved and query processor chooses the shortest possible time called query optimization. It can be seen as an extension to SQL server with some specialization.
A database modeling is not a place to express your inner creativity and find wild and crazy new ways of doing things. If you want to do wild and crazy, do it in your user interface in your application. In your database, I am afraid you want to be patient methodical step by step.
Questions:
  1. what’s it for? what’s the point of this database? In most case you’re building a database to support an application. Be as specific as possible.
  2. what do you have already? Don’t just import. The existing database may have some problem. What’s the existing process? Understanding the data is essential.
  3. what tables do you need? Real ones and abstract ones.
  4. what columns do you need? what data type? so it can be efficient about storing it. Flexibility is your friend but it’s not what you’re looking for in your database. email address pattern? Be as exact as possible means sequel server will enforce rules on those columns.
  5. what’s your primary key?
  6. what relationships do you need?
Attached a database:
Launch SSMS.
Connect to your SQL Server Instance.
Right-click on Databases in the Object Explorer.
Click Attach.
In the Attach Databases window, click the Add button.
Navigate to the directory containing the.MDF and.LDF files.
Select the.MDF file, and press OK.
Press OK again to attach the database.
trouble shooting is https://learningsqlserver.wordpress.com/2011/02/13/how-can-i-open-mdf-and-ldf-files-in-sql-server-attach-tutorial-troublshooting/
  • The keyword GO separates statements when more than one statement is submitted in a single batch. GO is optional when the batch contains only one statement.

data type

  • float(n), n is number of bytes for storage, 1<=n<=53
  • real, equal to float(4), value on the scale of 1e38

install

MS SQL Server 2008 R2 SP2 Express Edition
https://www.microsoft.com/en-us/download/confirmation.aspx?id=30438
make sure to download the large one (1GB) which has management tool.
install SQL Server 2008 in a non-clustered environment
instance: jychstar
instance root directory: C:\Program Files\Microsoft SQL Server\MSSQL10_50.JYCHSTAR
mix mode, password: 8920113
open SQL Server management studio
log in
right click on DataBases -> New DataBases -> SampleDB
This is equivalent to
CREATE DATABASE [SampleDB] ON  PRIMARY 
( NAME = N'SampleDB', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL10_50.JYCHSTAR\MSSQL\DATA\SampleDB.mdf' , SIZE = 3072KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
 LOG ON 
( NAME = N'SampleDB_log', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL10_50.JYCHSTAR\MSSQL\DATA\SampleDB_log.ldf' , SIZE = 1024KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
GO
-- list table
SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE='BASE TABLE'
create a table
dbo stands for database owner. If you are an administrator, dbo is the default schema.
create table dbo.table_test(
Person_ID int identity(1,1),
First_Name varchar(250),
Last_Name varchar(250),
MIddle_Name varchar(75),
Created_Data datetime not null default(getdate()),
Updated_Data datetime,
Active bit not null default(1)
);
identity(1,1) ensure an unique ID, which automatically increases 1 from 1.
insert into table
insert into table_test(First_Name, Last_Name, MIddle_Name)
values('Yuchao', 'Jiang','')
-- column list can be skipped if the values are in order
-- null allowed column can be dropped in column list

insert into table_test(First_Name, Last_Name, MIddle_Name)
select 'Yuchao1', 'Jiang2',''
union all
select 'Yuchao3', 'Jiang4','' 

update table_test
set First_Name = 'John'
where Last_Name = 'Jiang'
inquiry
select * from table_test
where First_name like 'Yucha%'
there is no keyword “limit” in sql server, you have to do like:
with r as 
(select ROW_NUMBER() over(order by First_Name desc) AS row_num,
First_Name, Last_Name from table_test)

select * from r
where row_num <=10

select top 10 * from table_test
group and count
select a,b,cout (*) as cnt
from table_test
group by a,b
having cnt = 2
order by cnt desc
partition and rank
partition give different number in the same group, rank gives same number for the same group
select row_number() over (order by cusotermerName) as ordernumber
row_number() over (partition by vendorName order by vendorName) as vendorname
dense_rank() over (order by cusotermerName) as customernumber
cusotermerName, productName,Amount,vendorName
from sales

SSIS

SSIS(sql server integration services), along with SSAS(analysis), SSRS(report), forms the so-called “business intelligence”. SSIS has a lot of overlap with ETL(Extraction, Transform and load). The main goal is to collect data from various sources, consolidate or gather data at on location for analyzing.
start -> sql server 2008 R2 -> sql server BI development studio -> new project -> template: integration services project, name: SSIS tutorial, location:C:\Users\Yuchao_Macbook\Documents\Visual Studio 2008\projects, clear “create directory for solution”
better tutorial: https://www.youtube.com/watch?v=I_Ae3suaL-U
SSIS is kind of visual approach to import raw data into database. You use various modules to connect things together. It has control flow, which is composed of different data flow.
However, the 2008 version seems to have some bugs. I can’t go through every detail right now.