How it started: I got up in the morning, went to the bathroom, and found the light on, so I assumed someone was inside. Twenty minutes later I went again and the light was still on. I knocked; no one was there, so I muttered "ridiculous", and the neighbors next door blew up over it. As the quarrel escalated, the old lady started shoving me. On the principle of never raising a hand against women and children, I didn't fight back. Then the man of that household charged out and started shoving me too, so at that point four people (the old lady, the shrew, the small man, and the big man) were shoving me. In the chaos, the tally: one pair of glasses knocked off, clothes pulled at, shoes pushed off, plus assorted scratches; I struck back exactly once, a textbook straight punch.
The shoving finally pushed its way into our room. Figuring the situation was past saving, I dialed 110 (the police); unbelievably, the first time they hung up on me. The other side even pretended to call 110 themselves: "Hello, 110? This is XXX",
Hilarious, acting as if they knew the dispatcher. I thought they had actually gotten through and waited for the police to arrive. Later I went to ask about it and got no answer to my question, so Wang made the call, claiming someone was getting killed, then said a fight had broken out. Before long the police came (one regular officer and one security guard). They knocked quite gently, and being experienced they recognized a neighborhood dispute and began questioning; each side gave its account. I stayed extremely calm while the other side and Wang traded accusations. The shrew claimed the old lady had been made ill or something, which was so crude that nobody believed it, including herself, the police, and us. The old lady went for the sympathy angle instead, saying her grandson had drawn something and that she had wanted them to come over for the New Year reunion dinner, and so on. The small man never showed; the big man stayed throughout. After the questioning the two sides were still arguing, the police got fed up and sent us to the police station for mediation, so the whole group rode the police car there. The mediator came, heard the situation, and, predictably, faulted both sides equally, then asked each side to state its losses. The main item was compensation for my glasses; the scrapes were let go. My glasses were bought three years ago for about 300 yuan; I asked them for 150, they countered with 100. Naturally, during this the two parties were each kept in a separate room. Done.
Nothing new to learn from any of it, except: be careful when old people are involved.
Also, Wang once kicked open the rental agent's door. I am afraid the other side may exploit that to damage or steal things inside and pin the blame on us, which would be real trouble. The plan on the way back was to first photograph the room with a phone and then lock the door behind us, but when we returned the door was already locked from the inside, and we have no idea whether they tampered with anything.
Wednesday, May 05, 2010
study notes
check the disk space on AIX
check the size of a certain file or directory: du -ms . (-m means report the size in megabytes, -s means summarize and return a single line)
check the disk: df -m
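For example (the paths and numbers here are illustrative, and the AIX df column layout can vary by version):
$ du -ms /home/user
123     /home/user
$ df -m /home
Filesystem    MB blocks      Free %Used    Iused %Iused Mounted on
/dev/hd1        2048.00    512.34   75%    12345    11% /home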
Tuesday, May 04, 2010
Is this a tragedy?
The beginning of a story is always beautiful.
But somewhere in the middle things happened; I don't know if we were simply too young.
It was said before that I fall short in four ways. I am someone who becomes serious very easily, and so is she. When angry, she often cannot control herself and reaches for the most hurtful words.
From the anger and disappointment at first facing those insults, to the fear on hearing words of rejection, to tolerance and then indifference on hearing blame, to finally losing heart.
2010.4.8
In the morning I had a dream. In it I was with so-and-so, and as we walked along the road we hit a puddle (I can't really remember), and then so-and-so said it was all my fault, blah blah, and then I woke up. So I really have gotten used to it, haven't I.
Wednesday, April 28, 2010
Avoid the warning when accessing a remote server via ssh from multiple accounts on one client.
If you put all of the local client's public keys on the server and then connect via ssh, it will produce a warning:
Warning: the RSA host key for 'a.b.com' differs from the key for the IP address 'x.x.x.x'.
This happens because ssh records the host key under the server's IP address as well as under its hostname; with multiple accounts on the local client, the RSA keys recorded for the hostname and for the IP can end up disagreeing, which results in this conflict warning.
It can be solved by setting the CheckHostIP option;
check out ssh_config(5)
e.g. sudo -u super_a ssh -o CheckHostIP=no username@remote_host 'echo "hi"'
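To make this persistent for one account, the same option can go in that account's ~/.ssh/config (a minimal sketch; the Host pattern here is an assumption, adjust to your server):
Host a.b.com
    CheckHostIP no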
Friday, April 23, 2010
study notes
umask value
If the base default permission for a file is XXX, the effective default becomes XXX & ~value, i.e. each bit set in the umask is cleared from the default.
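A worked example, assuming the usual base permissions of 666 for new files and 777 for new directories:
umask 022
# files: 666 & ~022 = 644 (rw-r--r--)
# directories: 777 & ~022 = 755 (rwxr-xr-x)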
Perl numbers
$n = 1234; # decimal integer
$n = 0b1110011; # binary integer
$n = 01234; # octal integer
$n = 0x1234; # hexadecimal integer
$n = 12.34e-56; # exponential notation
$n = "-12.34e56"; # number specified as a string
$n = "1234"; # number specified as a string
use taglist to improve vim
1 download Exuberant Ctags, extract it, and add its directory to PATH.
2 download the taglist plugin and extract it into ~/.vim/
3 restart vim and run :TlistOpen
4 done
You may want to read this example of vim settings:
set nobackup
set nu
set tabstop=4 " numbers of spaces of tab character
set shiftwidth=4 " numbers of spaces to (auto)indent
set ignorecase " ignore case when searching
behave mswin
"taglist
let Tlist_Sort_Type = "name" "sort the tags alphabetically by name
let Tlist_GainFocus_On_ToggleOpen = 1 "move the cursor into the taglist window when it opens
let Tlist_Auto_Highlight_Tag = 1 "automatically highlight the current tag in the taglist
let Tlist_Auto_Update = 1 "automatically update the taglist to include newly edited files
let Tlist_Exit_OnlyWindow = 1 "close Vim if the taglist is the only window
Tuesday, April 20, 2010
cygwin set display language
add
export LANG='C.UTF-8'
(for example to ~/.bashrc) to set the language to English, or
export LANG='zh-CN'
to set the language to Chinese.
Wednesday, April 14, 2010
study note
Perl sort a hash
1 sort by key
foreach $key (sort (keys(%hash_name))) {
print "\t\t$key \t\t$hash_name{$key}\n";
}
2 sort by value
make use of Perl's sort with a custom comparison:
foreach my $key ( sort { $hash_name{$a} <=> $hash_name{$b} } ( keys %hash_name ))
{
print "$key => $hash_name{$key}\n";
}
If you want the set sorted numerically in descending order,
swap $a and $b, e.g.
foreach my $key ( sort { $hash_name{$b} <=> $hash_name{$a} } ( keys %hash_name ))
{
print "$key => $hash_name{$key}\n";
}
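Putting both together, a minimal runnable sketch (the %hash_name data here is made up for illustration):
#!/usr/bin/perl
use strict;
use warnings;

my %hash_name = ( apple => 3, pear => 1, plum => 7 );  # hypothetical counts

# 1 sort by key (alphabetical)
foreach my $key ( sort keys %hash_name ) {
    print "$key => $hash_name{$key}\n";
}

# 2 sort by value, descending
foreach my $key ( sort { $hash_name{$b} <=> $hash_name{$a} } keys %hash_name ) {
    print "$key => $hash_name{$key}\n";
}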
Tuesday, April 13, 2010
study notes
getopts for perl?
check out Getopt::Std:
use Getopt::Std;
my %options = ();
getopts("a",\%options) or usage();
#HELP_MESSAGE();
addrule() if defined $options{"a"};
getopts("a",\%options) or usage();
#HELP_MESSAGE();
addrule() if defined $options{"a"};
sub HELP_MESSAGE()
{
$Getopt::Std::STANDARD_HELP_VERSION=1;
usage;
}
sub VERSION_MESSAGE()
{
$Getopt::Std::STANDARD_HELP_VERSION=1;
#print `grep "^# VERSION" ./odssvn.pl`;
}
Test:
perl t.pl --help
perl t.pl --version
perl t.pl -a
MD5 for perl
use Digest::MD5;
$ctx = Digest::MD5->new;     # create a new MD5 context
$ctx->add($data);            # feed in a string
$ctx->addfile(*FILE);        # feed in an already-opened filehandle
$digest = $ctx->digest;      # binary digest; note that reading a digest resets the context
$digest = $ctx->hexdigest;   # digest as a hex string
$digest = $ctx->b64digest;   # digest as base64
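For the common case of hashing a string in one shot, Digest::MD5 also exports a functional interface; a minimal sketch:
use Digest::MD5 qw(md5_hex);
my $checksum = md5_hex("some data");  # 32-character hex string
print "$checksum\n";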
Thursday, April 08, 2010
gvim set nobackup
Stop gvim from generating the filename~ backup file:
try Edit -> Startup Settings, then add "set nobackup" below the line source $VIMRUNTIME/mswin.vim
done
study note
1 To change the owner of the file program.c:
chown jim program.c
The user access permissions for program.c now apply to jim. As the owner, jim can use the chmod command to permit or deny other users access to program.c.
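For instance, jim could then deny group and others all access:
chmod go-rwx program.c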
Preparing to move my hukou (household registration) back home
Right now mine counts as a "pocket hukou" (papers carried around in person); my ID card expires in 5 years, and losing it would apparently be a huge hassle. So I am preparing to transfer the registration.
2 Moving out from the university: Xiamen University police station, tel. 0592-2185893. Required materials: hukou page, diploma, and the certificate issued by the destination.
Destination, my hometown in Chongqing: Chongqing household registration office, tel. 023-63962385
Tailong town police station, tel. 023-58501035, contact Zhang
Wanzhou district household administration section, tel. 64880012
1 Write an application stating the intent and reason for moving in. Get it stamped at Xiangping community (ask for Feng Chunxi), then hand it to the town police station together with a photocopy of the diploma and new ID photos (which must be taken at a designated ID-photo point). (This might normally require going in person, but things are easier to arrange in the countryside, so family can handle it on my behalf.)
The destination (Tailong) police station will then issue the certificate that the origin requires.
Then handle the move-out at the origin (Xiamen University),
then finish at the destination (Tailong).
Plan: this weekend get photos taken, photocopy the diploma, and express-mail everything home so the family can get it done.
China's household registration system is truly detestable.
Sunday, February 07, 2010
Data Warehouse Modeling_9
A type 3 slowly changing dimension just adds a column holding the historical value.
The type 3 slowly changing dimension technique allows us to see new and historical fact data by either the new or prior attribute values.
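A minimal sketch with hypothetical columns: a product dimension carries both category and prior_category; when a product is reassigned, the old value moves into prior_category, so facts can be rolled up by either the current or the prior assignment.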
Sunday, January 24, 2010
Interesting
Last night I was thinking about dirty things. All of a sudden, as if there were a switch, my mind just became clean and I was no longer drowned in them. It was a really strange experience.
Thursday, December 31, 2009
Data Warehouse Modeling_8
Common Matrix Mishaps
Departmental or overly encompassing rows.
Report-centric or too narrowly defined rows.
Overly generalized columns.
Separate columns for each level of a hierarchy.
Slowly Changing Dimensions
Type 1: Overwrite the Dimension Attribute
The original dimension is overwritten; only the latest information is saved.
Type 2: Add a New Dimension Row
A new row is added for each change, with effective time period attributes.
Type 3: Add a New Dimension Attribute
Adds a prior version of the attribute, so the user can choose either version of the dimension. Very useful when users want to switch between current and historical views.
Mini-Dimensions: Add a New Dimension
Break a huge dimension into a stable part and a frequently changing part.
Hybrid Slowly Changing Dimension Techniques
Try to combine the advantages of the basic techniques above. Be careful when using them.
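A hedged illustration with a hypothetical customer dimension, where customer 42 moves from Xiamen to Chongqing: a type 1 response overwrites city = 'Xiamen' with 'Chongqing' and loses the history; type 2 keeps the old row and adds a new row with a new surrogate key and effective dates; type 3 keeps a single row but adds a prior_city column that now holds 'Xiamen'.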
Role-Playing Dimensions
A single dimension used in many places in a model can be described as a 'role-playing dimension'.
Junk Dimensions
When the dimensional model cannot comfortably place leftover flags or text attributes, try these:
Leave them in the fact table.
but this is the worst practice.
Make them into separate dimensions.
It may lead to an explosion of dimensions. Kimball says "we strive to have fewer than twenty foreign keys in a fact table for most industries and business processes."
Eliminate them.
If they are meaningless, drop them.
Note: "junk dimension" is an insider's technical term; when naming the actual table, try something more neutral like "invoice indicator".
Snowflaking and Outriggers
A snowflaked dimension is a normalized dimension: attributes are separated into their own tables and then linked back.
An outrigger is useful if most of the dimension records share a handful of common attributes.
Types of fact tables
Transaction Fact Tables
Periodic Snapshot Fact Tables
Accumulating Snapshot Fact Tables
Factless Fact Tables
Monday, December 28, 2009
Data Warehouse Modeling_7
Application Architecture Document
• EXECUTIVE SUMMARY
o Business Understanding
o Project Focus
• METHODOLOGY
o Business Requirements
o High-Level Architecture Development
o Standards & Products
o Ongoing Refinement
• BUSINESS REQUIREMENTS AND ARCHITECTURAL IMPLICATIONS
o CRM (Campaign Management & Target Marketing)
o Sales Forecasting
o Inventory Planning
o Sales Performance
o Additional Business Issues
Data Quality
Common Data Elements and Business Definitions
• ARCHITECTURE OVERVIEW
o High Level Model
o Metadata Driven
o Flexible Services Layers
• MAJOR ARCHITECTURAL ELEMENTS
o Services and Functions
ETL Services
Customer Data Integration
External Demographics
Data Access Services
BI Applications (CRM/Campaign Mgmt; Forecasting)
Sales Management Dashboard
Ad Hoc Query and Standard Reporting
Metadata Maintenance
User Maintained Data System
o Data Stores
Sources and Reference Data
ETL Data Staging and Data Quality Support
Presentation Servers
Business Metadata Repository
o Infrastructure and Utilities
o Metadata Strategy
• ARCHITECTURE DEVELOPMENT PROCESS
o Architecture Development Phases
o Architecture Proof of Concept
o Standards and Product Selection
o High Level Enterprise Bus Matrix
APPENDIX A—ARCHITECTURE MODELS
Wednesday, December 23, 2009
Data Warehouse Modeling_6
Extract
• Data profiling (1)
• Change data capture (2)
• Extract system (3)
Clean and Conform
There are five major services in the cleaning and conforming step:
• Data cleansing system (4)
• Error event tracking (5)
• Audit dimension creation (6)
• Deduplicating (7)
• Conforming (8)
Deliver
The delivery subsystems in the ETL back room consist of:
• Slowly changing dimension (SCD) manager (9)
• Surrogate key generator (10)
• Hierarchy manager (11)
• Special dimensions manager (12)
• Fact table builders (13)
• Surrogate key pipeline (14)
• Multi-valued bridge table builder (15)
• Late arriving data handler (16)
• Dimension manager system (17)
• Fact table provider system (18)
• Aggregate builder (19)
• OLAP cube builder (20)
• Data propagation manager (21)
19 and 20 could be done in Cognos.
ETL Management Services
• Job scheduler (22)
• Backup system (23)
• Recovery and restart (24)
• Version control (25)
• Version migration (26)
• Workflow monitor (27)
• Sorting (28)
• Lineage and dependency (29)
• Problem escalation (30)
• Paralleling and pipelining (31)
• Compliance manager (32)
• Security (33)
• Metadata repository (34)
PROCESS METADATA
• ETL operations statistics including start times, end times, CPU seconds used, disk reads, disk writes, and row counts.
• Audit results including checksums and other measures of quality and completeness.
• Quality screen results describing the error conditions, frequencies of occurrence, and ETL system actions taken (if any) for all quality screening findings.
TECHNICAL METADATA
• System inventory including version numbers describing all the software required to assemble the complete ETL system.
• Source descriptions of all data sources, including record layouts, column definitions, and business rules.
• Source access methods including rights, privileges, and legal limitations.
• ETL data store specifications and DDL scripts for all ETL tables, including normalized schemas, dimensional schemas, aggregates, stand-alone relational tables, persistent XML files, and flat files.
• ETL data store policies and procedures including retention, backup, archive, recovery, ownership, and security settings.
• ETL job logic, extract and transforms including all data flow logic embedded in the ETL tools, as well as the sources for all scripts and code modules. These data flows define lineage and dependency relationships.
• Exception handling logic to determine what happens when a data quality screen detects an error.
• Processing schedules that control ETL job sequencing and dependencies.
• Current maximum surrogate key values for all dimensions.
• Batch parameters that identify the current active source and target tables for all ETL jobs.
BUSINESS METADATA
• Data quality screen specifications including the code for data quality tests, severity score of the potential error, and action to be taken when the error occurs.
• Data dictionary describing the business content of all columns and tables across the data warehouse.
• Logical data map showing the overall data flow from source tables and fields through the ETL system to target tables and columns.
• Business rule logic describing all business rules that are either explicitly checked or implemented in the data warehouse, including slowly changing dimension policies and null handling.
We call this usage-based optimization.
The solution for huge datasets
Typical adjustments include partitioning the warehouse onto multiple servers, either vertically, horizontally, or both. Vertical partitioning means breaking up the components of the presentation server architecture into separate platforms, typically running on separate servers. In this case you could have a server for the atomic level data, a server for the aggregate data (which may also include atomic level data for performance reasons), and a server for aggregate management and navigation. Often this last server has its own caching capabilities, acting as an additional data layer. You may also have separate servers for background ETL processing.
Horizontal partitioning means distributing the load based on datasets. In this case, you may have separate presentation servers (or sets of vertically partitioned servers) dedicated to hosting specific business process dimensional models. For example, you may put your two largest datasets on two separate servers, each of which holds atomic level and aggregate data. You will need functionality somewhere between the user and data to support analyses that query data from both business processes.
Presentation Server Metadata
PROCESS METADATA
• Database monitoring system tables containing information about the use of tables throughout the presentation server.
• Aggregate usage statistics including OLAP usage.
TECHNICAL METADATA
• Database system tables containing standard RDBMS table, column, view, index, and security information.
• Partition settings including partition definitions and logic for managing them over time.
• Stored procedures and SQL scripts for creating partitions, indexes, and aggregates, as well as security management.
• Aggregate definitions containing the definitions of system entities such as materialized views, as well as other information necessary for the query rewrite facility of the aggregate navigator.
• OLAP system definitions containing system information specific to OLAP databases.
• Target data policies and procedures including retention, backup, archive, recovery, ownership, and security settings.
The most important BI application types include the following:
• Direct access queries: the classic ad hoc requests initiated by business users from desktop query tool applications.
• Standard reports: regularly scheduled reports typically delivered via the BI portal or as spreadsheets or PDFs to an online library.
• Analytic applications: applications containing powerful analysis algorithms in addition to normal database queries. Pre-built analytic application packages include budgeting, forecasting, and business activity monitoring (BAM).
• Dashboards and scorecards: multi-subject user interfaces showing key performance indicators (KPIs) textually and graphically.
• Data mining and models: exploratory analysis of large "observation sets" usually downloaded from the data warehouse to data mining software. Data mining is also used to create the underlying models used by some analytic and operational BI applications.
• Operational BI: real time or near real time queries of operational status, often accompanied by transaction write-back interfaces.