Thursday, December 31, 2009

A memory

A new beginning for 2010

Data Warehouse Modeling_8

Common Matrix Mishaps
Departmental or overly encompassing rows.
Report-centric or too narrowly defined rows.
Overly generalized columns.
Separate columns for each level of a hierarchy.
Slowly Changing Dimensions
Type 1: Overwrite the Dimension Attribute
         The original value is overwritten in place; only the latest information is kept, so history is lost.
Type 2: Add a New Dimension Row
         A new row is added for each change, with attributes marking the time period for which the row is valid.
Type 3: Add a New Dimension Attribute
         A new column keeps another version of the attribute, so the user can choose which version of the dimension to use. Very useful when users want to switch between versions.
Mini-Dimensions: Add a New Dimension
         Break a huge dimension into its stable part and its frequently changing part.
Hybrid Slowly Changing Dimension Techniques
        Try to combine the advantages of the basic types above. Be careful when using them.
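As a rough illustration of Type 2, here is a minimal sketch in bash and awk. It assumes a hypothetical comma-separated dimension file with the columns surrogate_key, natural_key, city, valid_from, valid_to, and it uses 9999-12-31 as an assumed marker for the current row; the function name scd2_update and the layout are illustrative and do not come from the book.

```bash
#!/usr/bin/env bash
# Sketch of an SCD Type 2 change on a CSV "dimension table".
# Assumed (hypothetical) layout: surrogate_key,natural_key,city,valid_from,valid_to
# The current row of each member carries valid_to = 9999-12-31.

scd2_update() {
    local dim_file=$1 natural_key=$2 new_city=$3 change_date=$4
    local tmp_file
    tmp_file=$(mktemp)
    awk -F, -v OFS=, -v nk="$natural_key" -v city="$new_city" -v d="$change_date" '
        # Track the highest surrogate key seen so far.
        $1 + 0 > max { max = $1 + 0 }
        # Expire the row that is current for this natural key.
        $2 == nk && $5 == "9999-12-31" { $5 = d }
        { print }
        # Add the new current row with a fresh surrogate key.
        END { print max + 1, nk, city, d, "9999-12-31" }
    ' "$dim_file" > "$tmp_file" && mv "$tmp_file" "$dim_file"
}

# Demo: customer C001 moves from Beijing to Changchun.
dim=$(mktemp)
printf '1,C001,Beijing,2009-01-01,9999-12-31\n' > "$dim"
scd2_update "$dim" C001 Changchun 2009-12-31
cat "$dim"
```

After the call the file holds two rows for C001: the Beijing row closed on 2009-12-31 and a new current Changchun row. A Type 1 change would instead rewrite the city in place and keep a single row.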
Role-Playing Dimensions
        A single dimension that is used in several places in a model, each time in a different role, is called a 'Role-Playing Dimension'.
Junk Dimensions
        When legacy flags or text attributes do not fit into any existing dimension, consider these options:
        Leave them in the fact table.
           But this is the worst practice.
        Make them into separate dimensions.
           This may lead to an explosion of dimensions. Kimball says "we strive to have fewer than twenty foreign keys in a fact table for most industries and business processes."
        Eliminate them.
            If they are meaningless, discard them.
Note: "junk dimension" is a highly technical term; with business users, try a more neutral name such as "invoice indicator".
Snowflaking and Outriggers
        A snowflaked dimension is a normalized dimension: attributes are separated into their own tables and then linked back.
        An outrigger is useful if most of the dimension records share a handful of common attributes.

Three types of fact table (plus factless fact tables)
Transaction Fact Tables
Periodic Snapshot Fact Tables
Accumulating Snapshot Fact Tables
Factless Fact Tables

Monday, December 28, 2009

Data Warehouse Modeling_7

Application Architecture Document
• EXECUTIVE SUMMARY
o Business Understanding
o Project Focus
• METHODOLOGY
o Business Requirements
o High-Level Architecture Development
o Standards & Products
o Ongoing Refinement
• BUSINESS REQUIREMENTS AND ARCHITECTURAL IMPLICATIONS
o CRM (Campaign Management & Target Marketing)
o Sales Forecasting
o Inventory Planning
o Sales Performance
o Additional Business Issues
▪ Data Quality
▪ Common Data Elements and Business Definitions
• ARCHITECTURE OVERVIEW
o High Level Model
o Metadata Driven
o Flexible Services Layers
• MAJOR ARCHITECTURAL ELEMENTS
o Services and Functions
▪ ETL Services
▪ Customer Data Integration
▪ External Demographics
▪ Data Access Services
▪ BI Applications (CRM/Campaign Mgmt; Forecasting)
▪ Sales Management Dashboard
▪ Ad Hoc Query and Standard Reporting
▪ Metadata Maintenance
▪ User Maintained Data System
o Data Stores
▪ Sources and Reference Data
▪ ETL Data Staging and Data Quality Support
▪ Presentation Servers
▪ Business Metadata Repository
o Infrastructure and Utilities
o Metadata Strategy
• ARCHITECTURE DEVELOPMENT PROCESS
o Architecture Development Phases
o Architecture Proof of Concept
o Standards and Product Selection
o High Level Enterprise Bus Matrix
APPENDIX A—ARCHITECTURE MODELS

Wednesday, December 23, 2009

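Tired

I don't want to get to know anyone, and I don't want to keep in touch either. It all feels too defeating.

Data Warehouse Modeling_6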

Extract

• Data profiling (1)
• Change data capture (2)
• Extract system (3)
Clean and Conform
There are five major services in the cleaning and conforming step:
• Data cleansing system (4)
• Error event tracking (5)
• Audit dimension creation (6)
• Deduplicating (7)
• Conforming (8)

Deliver
The delivery subsystems in the ETL back room consist of:
• Slowly changing dimension (SCD) manager (9)
• Surrogate key generator (10)
• Hierarchy manager (11)
• Special dimensions manager (12)
• Fact table builders (13)
• Surrogate key pipeline (14)
• Multi-valued bridge table builder (15)
• Late arriving data handler (16)
• Dimension manager system (17)
• Fact table provider system (18)
• Aggregate builder (19)
• OLAP cube builder (20)
• Data propagation manager (21)

Subsystems 19 and 20 could be done in Cognos.
ETL Management Services

• Job scheduler (22)
• Backup system (23)
• Recovery and restart (24)
• Version control (25)
• Version migration (26)
• Workflow monitor (27)
• Sorting (28)
• Lineage and dependency (29)
• Problem escalation (30)
• Paralleling and pipelining (31)
• Compliance manager (32)
• Security (33)
• Metadata repository (34)

PROCESS METADATA
• ETL operations statistics including start times, end times, CPU seconds used,
disk reads, disk writes, and row counts.
• Audit results including checksums and other measures of quality and
completeness.
• Quality screen results describing the error conditions, frequencies of
occurrence, and ETL system actions taken (if any) for all quality screening
findings.
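
The operations statistics in the first item can be captured with very little machinery. A minimal sketch in bash, assuming a job that writes its rows to standard output and a hypothetical pipe-delimited log file; the function name run_with_stats and the log layout are illustrative only, and CPU seconds and disk I/O are left out.

```bash
#!/usr/bin/env bash
# Run an ETL step, save its output, and append one line of process metadata:
# job name | start time | end time | row count | exit status
run_with_stats() {
    local job_name=$1 log_file=$2 out_file=$3
    shift 3
    local start_ts end_ts rows status=0
    start_ts=$(date '+%Y-%m-%d %H:%M:%S')
    "$@" > "$out_file" || status=$?
    end_ts=$(date '+%Y-%m-%d %H:%M:%S')
    rows=$(wc -l < "$out_file")
    # $((rows)) strips the padding some wc implementations add.
    printf '%s|%s|%s|%s|%s\n' \
        "$job_name" "$start_ts" "$end_ts" "$((rows))" "$status" >> "$log_file"
    return "$status"
}

# Demo: a stand-in "extract" that emits three rows.
work_dir=$(mktemp -d)
run_with_stats extract_demo "$work_dir/etl_stats.log" "$work_dir/extract.out" \
    printf '%s\n' row1 row2 row3
cat "$work_dir/etl_stats.log"
```

Each run adds one line to the log, so the file can later be loaded into a process metadata table or scanned for slow or failed jobs.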


TECHNICAL METADATA


• System inventory including version numbers describing all the software
required to assemble the complete ETL system.
• Source descriptions of all data sources, including record layouts, column
definitions, and business rules.
• Source access methods including rights, privileges, and legal limitations.
• ETL data store specifications and DDL scripts for all ETL tables, including
normalized schemas, dimensional schemas, aggregates, stand-alone relational
tables, persistent XML files, and flat files.
• ETL data store policies and procedures including retention, backup, archive,
recovery, ownership, and security settings.
• ETL job logic, extract and transforms including all data flow logic
embedded in the ETL tools, as well as the sources for all scripts and code
modules. These data flows define lineage and dependency relationships.
• Exception handling logic to determine what happens when a data quality
screen detects an error.
• Processing schedules that control ETL job sequencing and dependencies.
• Current maximum surrogate key values for all dimensions.
• Batch parameters that identify the current active source and target tables for
all ETL jobs.

BUSINESS METADATA
• Data quality screen specifications including the code for data quality tests,
severity score of the potential error, and action to be taken when the error
occurs.
• Data dictionary describing the business content of all columns and tables
across the data warehouse.
• Logical data map showing the overall data flow from source tables and fields
through the ETL system to target tables and columns.
• Business rule logic describing all business rules that are either explicitly
checked or implemented in the data warehouse, including slowly changing
dimension policies and null handling.

We call this usage based optimization

The solution for huge datasets
Typical adjustments include partitioning the
warehouse onto multiple servers, either vertically, horizontally, or both. Vertical
partitioning means breaking up the components of the presentation server architecture
into separate platforms, typically running on separate servers. In this case you could
have a server for the atomic level data, a server for the aggregate data (which may
also include atomic level data for performance reasons), and a server for aggregate
management and navigation. Often this last server has its own caching capabilities,
acting as an additional data layer. You may also have separate servers for background
ETL processing.
Horizontal partitioning means distributing the load based on datasets. In this case,
you may have separate presentation servers (or sets of vertically partitioned servers)
dedicated to hosting specific business process dimensional models. For example, you
may put your two largest datasets on two separate servers, each of which holds atomic
level and aggregate data. You will need functionality somewhere between the user
and data to support analyses that query data from both business processes.

Presentation Server Metadata
PROCESS METADATA
• Database monitoring system tables containing information about the use of
tables throughout the presentation server.
• Aggregate usage statistics including OLAP usage.
TECHNICAL METADATA
• Database system tables containing standard RDBMS table, column, view,
index, and security information.
• Partition settings including partition definitions and logic for managing them
over time.
• Stored procedures and SQL scripts for creating partitions, indexes, and
aggregates, as well as security management.
• Aggregate definitions containing the definitions of system entities such as
materialized views, as well as other information necessary for the query rewrite
facility of the aggregate navigator.
• OLAP system definitions containing system information specific to OLAP
databases.
• Target data policies and procedures including retention, backup, archive,
recovery, ownership, and security settings.

The most important BI application types include the following:

• Direct access queries: the classic ad hoc requests initiated by business users
from desktop query tool applications.
• Standard reports: regularly scheduled reports typically delivered via the BI
portal or as spreadsheets or PDFs to an online library.
• Analytic applications: applications containing powerful analysis algorithms
in addition to normal database queries. Pre-built analytic applications
packages include budgeting, forecasting, and business activity monitoring
(BAM).
• Dashboards and scorecards: multi-subject user interfaces showing key
performance indicators (KPIs) textually and graphically.
• Data mining and models: exploratory analysis of large "observation sets"
usually downloaded from the data warehouse to data mining software. Data
mining is also used to create the underlying models used by some analytic and
operational BI applications.
• Operational BI: real time or near real time queries of operational status, often
accompanied by transaction write-back interfaces.

Tuesday, December 15, 2009

when you get old, you will know which ones could not be friends.

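Last night she said I have "five deficiencies": poor background, poor looks, poor education, poor income, poor ideals. She also said the best I could find is a working girl earning two or three thousand a month.
Judging from this she is not necessarily a snobbish person, and she has been having a hard time lately, but a friendship is hard to keep going if, as a friend, you get no respect and are looked down on. I suppose I should be more generous and look after others, but when the other person feels aggrieved even at being looked after, why go looking for a snub?
Maybe I have gotten used to a life with a certain someone in it, but if it isn't worth it (on every level), why keep it going?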

Thursday, December 10, 2009

Motives for Research (Einstein's address at Planck's birthday celebration) : 弯曲评论 (repost)

zz
http://www.tektalk.cn/2009/12/06/探索的动机(爱因斯坦在普朗克生日会上的讲话)/
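Appendix: Motives for Research (Einstein's address at Planck's birthday celebration)

In the temple of science there are many mansions, and the people who live in them are of all sorts, as are the motives that led them there. Many take to science because it gives them a joyful sense of superior intellectual power; science is their own special sport, in which they look for vivid experience and the satisfaction of their ambition. Many others in the temple have laid the products of their brains on the altar for purely utilitarian purposes. If an angel of the Lord were to come and drive everyone belonging to these two categories out of the temple, the crowd gathered there would shrink considerably, but some people would still remain inside, from both past and present times. Our Planck is one of them, and that is why we love him.

I am well aware that we have just, in imagination, casually driven out many excellent people who contributed greatly, perhaps chiefly, to building the temple of science; in many cases our angel would also find it hard to decide. But of one thing I am sure: if the temple held only the two kinds of people who were driven out, it would never have come to exist, any more than a forest can consist of nothing but creepers. For these people, any sphere of human activity will do if the opportunity arises; whether they become engineers, officials, tradesmen, or scientists depends entirely on circumstances. Now let us look again at those who have found favor with the angel.

Most of them are rather odd, uncommunicative, and solitary people, yet despite these common traits they really differ from one another far more than the many who were driven out. What has brought them to the temple? That is a difficult question, and no single answer will cover it. To begin with, I agree with Schopenhauer that one of the strongest motives leading people to art and science is escape from the repellent crudity and hopeless dreariness of everyday life, from the fetters of one's own ever-shifting desires. A finely tempered person always longs to escape from personal life into the world of objective perception and thought; this desire is like the city dweller's longing to escape from noisy, crowded surroundings into the quiet of the high mountains, where one can gaze freely through the still, pure air and lose oneself in tranquil scenery that seems designed for eternity.

Beside this negative motive there is a positive one. People always try to form, in whatever way suits them best, a simplified and intelligible picture of the world; they then try to substitute this cosmos of theirs for the world of experience, and so to master it. This is what the painter, the poet, the speculative philosopher, and the natural scientist do, each in their own way. Each makes this cosmos and its construction the pivot of their emotional life, in order to find in this way the peace and security that they cannot find within the narrow confines of personal experience.

What place does the theoretical physicist's picture of the world occupy among all these possible pictures? It demands the highest possible standard of rigor and precision in describing relations, a standard that only the language of mathematics can reach. On the other hand, the physicist must restrict his subject matter very severely: he must content himself with describing the simplest events in our field of experience. To reproduce all more complex events with the precision and logical completeness that the theoretical physicist demands is beyond the power of the human intellect. Supreme purity, clarity, and certainty come at the cost of completeness. But when one timidly leaves aside everything elusive and more complex, what can be the attraction of getting to know such a tiny part of nature? Does the product of such a cautious effort deserve the proud name of a theory of the universe?

I believe it does; for the general laws on which the structure of theoretical physics is based should hold for every natural phenomenon. With them, it ought to be possible to arrive by pure deduction at the description, that is to say the theory, of every natural process, including life, provided that this process of deduction is not too far beyond the capacity of the human intellect. So the physicist's giving up the completeness of his cosmos is not a matter of fundamental principle.

The supreme task of the physicist is to arrive at those universal elementary laws from which the cosmos can be built up by pure deduction. There is no logical path to these laws; only intuition, resting on a sympathetic understanding of experience, can reach them. Given this methodological uncertainty, one might suppose that there could be many equally valid systems of theoretical physics; in theory this view is no doubt correct. But the development of physics has shown that at any given moment, out of all conceivable constructions, a single one has always proved far superior to the rest. Nobody who has really gone deeply into the matter will deny that in practice the world of phenomena uniquely determines the theoretical system, even though there is no logical bridge between phenomena and their theoretical principles; this is what Leibniz so aptly described as a "pre-established harmony." Physicists often reproach epistemologists for not paying enough attention to this. Here, it seems to me, lie the roots of the controversy carried on some years ago between Mach and Planck.

The longing to behold this pre-established harmony is the source of inexhaustible perseverance and patience. We see that this is why Planck has devoted himself to the most general problems of our science, without letting himself be diverted to more pleasant and more easily attained goals. I have often heard colleagues try to attribute this attitude of his to extraordinary willpower and discipline, but I believe that is wrong. The state of mind that enables a person to do work of this kind is akin to that of the religious believer or the lover; the daily effort comes from no deliberate intention or plan, but straight from passion. There he sits, our beloved Planck, smiling inwardly at my childish playing about with the lantern of Diogenes. Our affection for him needs no threadbare explanation. May his love of science continue to light his path in the future and lead him to the solution of the most important problem in present-day physics, a problem he himself posed and has already done so much to solve. May he succeed in uniting quantum theory with electrodynamics and mechanics in a single logical system.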

Friday, December 04, 2009

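After losing the remittance slip, part 2

The last time I blogged was last Wednesday. On Thursday afternoon I went over and, after waiting forty minutes, found out that the person in charge of this had not come in. I went to the lobby manager; at first she said there was nothing she could do if that person wasn't there, so I said, "I don't care which of your people is in or out, I came here to do business with your bank." The lobby manager then took me to a more senior person in charge, Deputy Director Li. We talked it over, and I was asked to wait outside. After about 40 minutes the manager came out and said they hadn't found it, and asked me to leave a phone number and wait at home. There was nothing else I could do.
This Tuesday, Director Li called to say it had been found and asked whether I still wanted it. Of course I said yes, and it was faxed to me. So this matter has reached its milestone.
Actually I was fairly confident when I went on Thursday, but unfortunately that person wasn't there, and after I got back I worried that they might forget about it or not bother; besides, I had forgotten to take their contact details, so I couldn't reach them. I didn't expect that he would actually get it done for me.
The next step is to use these documents to get the receipt. So busy.
Tomorrow should be my birthday, and on Sunday there should be a small get-together with classmates.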

Thursday, December 03, 2009

Data Warehouse Modeling_5

Definition of technical architecture
The technical architecture is the overall plan for what you want the DW/BI system to be when it's ready for serious use.
It describes the flow of data from the source systems to the decision makers and the transformations and data stores that data goes through along the way. It also specifies the tools, techniques, utilities, and platforms needed to make that flow happen.
We think about metadata as all the information that defines and describes the structures, operations, and contents of the DW/BI system. The DW/BI industry often refers to two main categories of metadata: technical and business.

Technical metadata defines the objects and processes that make up the DW/BI system from a technical perspective. This includes the system metadata that defines the data structures themselves, like tables, fields, data types, indexes, and partitions in the relational engine, and databases, dimensions, measures, and data mining models. In the ETL process, technical metadata defines the sources and targets for a particular task, the transformations (including business rules and data quality screens), and their frequency.
Process metadata describes the results of various operations in the warehouse. In the ETL process, each task logs key data about its execution, such as start time, end time, CPU seconds used, disk reads, disk writes, and rows processed. Similar process metadata is generated when users query the warehouse. This data is initially valuable for troubleshooting the ETL or query process.

Wednesday, December 02, 2009

Note on using Delete in Kettle

The "=" condition in Kettle's "delete" cannot match NULL against NULL (as in SQL, NULL = NULL does not evaluate to true), so if any of the columns in the comparison is NULL, the "delete" will not remove that row.


Saturday, November 28, 2009

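What is the saddest thing?

It is when the woman you once loved, who seemed untouched by the dust of this world, turns into a middle-aged auntie at the vegetable market.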

Wednesday, November 25, 2009

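After losing the remittance slip.

For an adult, losing a remittance slip is shameful, but it happened, so it has to be dealt with.
First, be clear that the remittance slip itself cannot be reissued, but you can get a replacement voucher, which is really a photocopy of the bank's voucher.
Information needed: 1. the remitting branch; 2. the date (to the day is enough); 3. the payee's details: account number, account name, and bank of deposit; 4. the amount; 5. the teller number.
The teller number may be hard for most people to find out, but that doesn't matter. If you remember which window it was, you can ask the lobby manager to look it up; if you don't, that's fine too, you can ask for all the teller numbers on duty that day. (Internally the bank can actually look it up from the payee's account information, which is how mine was found, but that is probably against the rules.)
1. Find the sub-branch the transaction belongs to; the lobby manager can tell you.
2. Find that sub-branch's lobby manager and explain the situation. The bank should offer this service, and he will take you to get the photocopy made.
3. Once you find the person who handles it, give him the information above and he can run the search. The search costs a fee (I'm only going tomorrow, so I don't know how much yet). If you can't pin down the teller number, at least give him all the teller numbers on duty that day to narrow the range. In that case, since there are a lot of transactions to go through, you may need to ask nicely.
That should do it.
Note: lobby managers nowadays seem fairly responsible, perhaps because they are afraid of complaints. Still, it's best not to fall out with them, because then they may offer help beyond what the rules strictly provide. Going purely by the rules, this would be very hard to get done.
Let the other side explain as much as possible, because sometimes they will give useful information. For example, when I inquired, the lobby manager said that if I gave the window number they could tell which teller it was, and windows don't change, so we can infer that they can provide all the teller numbers for that day.
Well, of course mine is only a half-success so far. I'll carry out step three tomorrow morning or afternoon.
To be continued.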

Sunday, November 22, 2009

H1N1, continued.

Turns out that it was not H1N1 flu but another respiratory disease,
so-called bronchitis. Lucky, but not the luckiest, for me.

Sunday, November 15, 2009

H1N1

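Cough (in the past a cold only gave me a runny nose), diarrhea, a slight headache; had diarrhea once.
It might be H1N1, or maybe I just caught a chill from too much wind after coming back to Beijing. So depressing.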

Blue screen caused by the Microsoft Pinyin IME

key words: Windows XP, input method, cannot log in, blue screen, default input method, modify
I had been using the Google input method with the Dvorak keyboard layout. Yesterday someone wanted to use my laptop, installed Microsoft Pinyin IME 2003, and set it as the default input method. After that I could not log into Windows XP any more: the winlogon and explorer processes both died at logon, in safe mode as well, followed by a blue screen with 00021a hard error. Some fatal error must occur when the new default input method loads.
Luckily I also had a Linux OS, Fedora, installed (it had already rescued me once from the 北信源 XX network monitor). After some searching, the following is how the problem was settled.
Log in to Fedora and install the registry editor chntpw; it is available through yum.
cd to %UserProfile%, e.g. C:\Documents and Settings\Administrator.
chntpw NTUSER.dat
cd Keyboard Layout
cd Preload
'edit' 1 (sorry for forgetting the command, use ? to get help)
change it to 00000804 US-en; this one should be safe.
q
I believe the problem is no longer a pain to you. Wish it helps.

Monday, November 09, 2009

Data Warehouse Modeling_4


return on investment (ROI)
net present value (NPV) or internal rate of return (IRR)
The major roles involved in the DW/BI implementation: front office, coaches, regular lineup, and special teams

Front Office: Sponsors and Drivers
Folks in the front office aren't involved on a daily basis, but they have a significant
influence on the rest of the team.
DW/BI Director/Program Manager


Some team members are assigned to the project full time; others are involved on a
part-time or sporadic basis.

Project managers need strong
organizational and leadership skills to keep everyone involved moving in the same direction.

Business Analyst
The business analyst is responsible for leading the business requirements definition activities and then representing those requirements as the technical architecture,

Metadata Manager


This person is not responsible for inputting all the metadata, but rather is a watchdog to ensure that everyone is contributing their relevant piece,

Special Teams
Technical Architect/Technical Support Specialist: Depending on your environment, there may be platform specialists who get involved in early stages of the design to perform resource and capacity planning.

Lead Tester: The lead tester develops test methodologies, and confirms that the tests cover the span of the data, processes, and systems.

The DW/BI project needs a detailed, integrated project plan given its complexity, both in terms of tasks and players.


If a single task requires more than two weeks, subtasks should be identified. (I think any task over one week needs them.)

The best project management software package will be worthless unless the time is invested to input and maintain the project plan.

Track the following information for each task:
Resources: 
Original estimated effort
Original estimated start date
Original estimated completion date:
Status:
Updated start date
Updated completion date:
Effort to finish:
Late days:
% Completed:
Dependencies:
Also don't forget that ETL development is a classic software development task with five distinct phases: development, unit testing, system testing, acceptance testing, and final rollout with documentation. Inexperienced project managers may focus most of their attention on what turns out to be only the first of the five steps. (ETL testing needs to be done in layers.)

Sunday, November 08, 2009

Data Warehouse Modeling_3

In reality, scope and justification go hand-in-hand,
(Scoping and adjusting go on at the same time.)
Note Project scope should be driven by the business's requirements.
(Scope is determined by business requirements.)
Each early iteration of your DW/BI program should be limited in scope to
the data resulting from a single business
process. (At the start, don't turn a simple problem into a complex one.) In other words, start small

Given the lengthy project timeframes associated with complex software
development projects combined with the realities of relatively high rates
of under-delivery, there's
been much interest in the rapid application development movement, often
measured in weeks. (Rapid development, usually measured in weeks.)


In fact, we suggest that BI team members work in close proximity to the
business so they're readily available and responsive;
(Developers stay closer to the business people; business-driven.)

Some development teams have naturally fallen into the trap of creating
analytic or reporting solutions in a vacuum. In most of these situations,
the team worked with a small set of users to extract a limited set of
source data and make it available to solve their unique problems. The
outcome is often a stand-alone data stovepipe that can't be leveraged by
others or worse yet, delivers data that doesn't tie to the organization's
other analytic information.
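(Analysis work that ends up as nothing but reports: isn't that exactly our team's problem right now? Because requirements were gathered in isolation, the things we built are isolated too and cannot be used with one another.)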


project charter, the document explains the project's focus and motivating
business requirements, objectives, approach,anticipated data and target
users, involved parties and stakeholders, success criteria,assumptions and
risks. It may also be appropriate to explicitly list data and analyses
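(Focus, requirements, objectives, approach, anticipated data and target users, involved parties/stakeholders, success criteria (acceptance criteria), assumptions and risks.)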

P38

Thursday, November 05, 2009

weird way to get return value

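I guess shell is always weird. If you run a command inside a subroutine in
a script and need its result, try return in the command part and then
return $? in the subroutine, as the following shows.

function isRunnable()
{
    typ=$1;        # not used below
    filename=$2;
    day_a=$3;
    day=${day_a:=''}
    # The while loop is the last stage of a pipeline, so bash runs it in a
    # subshell: "return" inside the loop only ends that subshell, and its
    # status becomes the exit status of the whole pipeline.
    cat 'exe_seq_all.dat' | grep "$filename$day$" | while read line
    do
        arr=($line);
        prework=${arr[0]};
        if [ ! `grep -c " $prework " 'fin.dat' ` = 0 ]
        then
            return 0;
        fi;
        return 1;
    done
    # Pass the status of the pipeline on to the caller.
    return $?;
}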
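
A minimal sketch of an alternative, assuming bash (process substitution is not available in plain POSIX sh): feeding the loop through `< <(...)` keeps it in the function's own shell, so `return` leaves the function directly and no second `return $?` is needed. The function name first_prereq_done, its three arguments, and the sample job names are hypothetical, chosen only for this illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the first prerequisite listed for a job
# in the sequence file also appears in the finished-jobs file.
first_prereq_done() {
    local job=$1 seq_file=$2 fin_file=$3
    local prework _rest
    # Process substitution keeps the while loop in the current shell,
    # so "return" here really returns from the function.
    while read -r prework _rest; do
        if grep -q " $prework " "$fin_file"; then
            return 0
        fi
        return 1
    done < <(grep "$job\$" "$seq_file")
    # No matching line in the sequence file: report that separately.
    return 2
}

# Demo with throwaway files: each line is "<prerequisite> <job>".
demo_dir=$(mktemp -d)
printf 'load_dim load_fact\nextract load_dim\n' > "$demo_dir/exe_seq_all.dat"
printf ' extract \n' > "$demo_dir/fin.dat"

for job in load_dim load_fact; do
    if first_prereq_done "$job" "$demo_dir/exe_seq_all.dat" "$demo_dir/fin.dat"; then
        echo "$job: runnable"
    else
        echo "$job: not yet"
    fi
done
```

In the demo, load_dim is reported as runnable because its prerequisite extract is listed in fin.dat, while load_fact is not yet runnable.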

Tuesday, November 03, 2009

recover from rm *

I had a good meal after a hard afternoon's work. When I came back, I typed
rm * in the Cygwin command line. All the work I'd done in the afternoon was
gone. I was much upset to learn there is no 'recycle bin' in Cygwin.
Luckily, after some searching on the net, I found 'File Scavenger', which
helped me restore the *.sh *.dat *.pl *~ files from the disk. Thanks, Que Tek
Consulting.co

Monday, November 02, 2009

Notes from The Data Warehouse Lifecycle Toolkit

Three keys to ensure BI/DW success
1 Strong Senior Business Management Sponsor(s)
2 Compelling Business Motivation
3 Feasibility


DW/BI systems that align with strategic business motivations and
initiatives stand a
good chance of succeeding.

In some sense, any SELECT DISTINCT investigative query on a database
field could be considered data profiling
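
The same idea carries over to flat-file extracts. A minimal sketch in bash and awk, assuming a simple comma-separated file with a header row and no quoted commas; the function name profile_column and the sample data are made up for illustration.

```bash
#!/usr/bin/env bash
# Count the distinct values in one column of a headered CSV file, most
# frequent first: roughly SELECT col, COUNT(*) ... GROUP BY col for flat files.
profile_column() {
    local csv_file=$1 column_no=$2
    awk -F, -v c="$column_no" '
        # Skip the header; show missing values explicitly.
        NR > 1 { v = ($c == "" ? "<empty>" : $c); n[v]++ }
        END    { for (v in n) printf "%d\t%s\n", n[v], v }
    ' "$csv_file" | sort -rn
}

# Demo on a throwaway extract with one missing city.
sample=$(mktemp)
printf 'id,city\n1,Beijing\n2,Beijing\n3,Changchun\n4,\n' > "$sample"
profile_column "$sample" 2
```

Even this crude count surfaces the usual profiling findings: the dominant values, unexpected spellings, and how many rows have no value at all.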

Thursday, September 24, 2009

Isolation suitable for ETL

The best isolation level for ETL is usually DIRTY READ, because it neither sets any locks nor honors locks held by others. Since we mostly extract from a static database, there is little risk of reading uncommitted changes, so it costs the least and runs the fastest.
In Informix, the statement is SET ISOLATION TO DIRTY READ.

The following table, as given in the Oracle documentation, shows which read phenomena each isolation level allows.
Read uncommitted is the same as DIRTY READ.

                  Dirty Read     Nonrepeatable Read      Phantom Read
Read uncommitted  Possible       Possible                Possible
Read committed    Not possible   Possible                Possible
Repeatable read   Not possible   Not possible            Possible
Serializable      Not possible   Not possible            Not possible

Wednesday, September 16, 2009

change is difficult

I used to be confused by the fact that people are always not listening. And
now I have experienced the same thing, though this time it's me who is not
listening. Why is that? Lack of self-confidence?
Maybe.

Saturday, September 12, 2009

What am I doing?

I saw on Qzone that a classmate has quietly made up his mind to become a rival to hw. And what am I doing?

Tuesday, September 08, 2009

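Last night's encounter

Yesterday after work I took the elevator alone back to my room. First I saw the elevator coming down from the second floor. The door opened and there was a woman inside, dressed fairly normally and decent-looking, who strangely did not get off (the first floor is the lowest the elevator goes). I thought she was going up, so I asked, 'Going up?' She didn't really answer. I was going to the third floor, she to the fifth. Then I turned around, and I could feel her staring at me the whole time. At the second floor she suddenly asked, 'Which room are you in?' Me: 'Uh... um... 323.' Then she asked, 'Are you alone?' Poor me, I still hadn't realized what she did for a living; it just felt eerie. By then the elevator had reached the third floor, and as I hurried out I blurted, 'No.' After walking three to five meters, I suddenly began to wonder. No way, had I just run into a prostitute? Have I already reached the age at which a prostitute sees me as a potential customer? What a tragedy, and poor me, single for so many years.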


Thursday, September 03, 2009

cognos sdk dev note

get all the members of a role:
the search path is like "expandMembers(CAMID(\"::System Administrators\"))"
get all the roles under a certain namespace: "CAMID(\":\")//role"
get the objects the querier has read or traverse permission on: //folder[permission('read') or permission('traverse')]
get all the groups and roles a user or set of users belongs to: "membership(CAMID(\"dbAuth:u:8888888888\"))"

Wednesday, September 02, 2009

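What was lost?

Depressing. Today I upgraded to Opera 10 and it wiped out the draft of a letter I had written earlier. It had been sitting there all along because I never had the nerve to send it, but I wanted to keep it there so that maybe one day I could go back and read it.
Or is this actually fate, telling me not to stay entangled there?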

Tuesday, September 01, 2009

cognos sdk note

When you use the search capability of cmservice in the SDK, a search can
traverse an object by name even if you don't have permission to access it,
although you cannot browse it.

The implementations of the service interfaces often end with 'stub'.

/directory is the path of the namespaces.

Use query(searchPath, properties, sortBy, options);
properties is the enumeration of the properties you want to get.

"execute, read, setPolicy, traverse, write" are the five permissions;
NmtokenArrayProp contains the list of privileges.

Monday, August 17, 2009

Cognos SDK log: how to set the default type of MDL to verb

When you want to make changes to an existing model, we recommend that you
use verb MDL, but first, you must add the following line to the cogtr.xml
file:
VerbOutput=1

Thursday, August 13, 2009

OH MY GOD

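Bored today... went to Qzone to read the news. Saw that my cousin had posted new photos. One look: oh my god
Who is this fat person? His jaw is wider than his head.... I almost didn't recognize him. Did my aunt's genes get passed down to him?
Time... only a few years have passed. Has so much changed already?
As for me, I have simply grown older, that's all.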

Sunday, August 02, 2009

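Don't feel like writing Java code

Just looking at code makes me not want to move.

Occasional notes

Arrived in Changchun yesterday. I can no longer remember whether this is the fourth or the fifth time. This trip is for the go-live work; it shouldn't take very long, a week should do.
I was actually very happy when I went back last time, because I could see so-and-so again, and because there was another chance to get closer.
But things just didn't develop the way I had wished. I couldn't control myself and made the atmosphere very awkward.
Perhaps I never had a chance to begin with.
Yesterday I saw in Lie to Me that more popular people are more talented at pretending and deceiving and better at handling others; maybe I mistook that kind of handling for something real....
Sigh..
And the Anhui project has indeed been given to so-and-so to lead, with me apparently doing development on it. Resentment.
I feel it is already hard to grow faster and higher in this project team. Of course the recent SDK study can strengthen my Cognos skills a lot; not everyone who uses Cognos gets the chance to work with the SDK.
Another thing to be happy about is that 18M called to talk about a job, on 7.28, i.e. last Tuesday. On the other hand, they haven't notified me about an interview yet. A friend says it normally takes some time to go from a phone interview to an in-person one; hope it's true. Otherwise it means I didn't pass the interviewers' pre-screening; how sad.
I'm a bit afraid. In every hiring process I've been through, I passed the written test and failed the interview, and I don't know why. Is it because I come across as lacking self-confidence, or because I don't know how to present myself?
If I can go to 18M, it will definitely be a leap in my career. Even if the job is maintenance plus development (I suspect combined Lotus and Cognos development), I would have work experience at a top foreign company.
Now that I'm in Changchun, there's a chance to see classmate Sun; so-and-so says I need more experience...
Sigh....
Maybe McDull, Kung Fu Ding Dong (麦兜响当当) will do. The previous two films were pretty good.
If 18M doesn't work out, it means going to Anhui for 3 months.
If classmate Pan doesn't share the rent, it means I'll have to pay 4K. ....., I really earn too little.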

Sunday, July 26, 2009

cognos sdk dev note

If you use the BI Bus API, you can use optimistic concurrency control to
prevent users from accidentally overwriting each other's changes.
Optimistic concurrency control relies on object versions to detect
conflicting transactions.

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Passing Environment Variables to External Applications

You can use a URL to pass environment variables, such as content locale,
to external applications. For example, if a user clicks a link that drills
through to an external application, the content locale environment
variable can be passed through the URL so that the language the user sees
in the report is the same as the language the user sees in the application.

The mandatory building blocks for a URL that passes environment variables
to an external application are as follows:

b_action=xts.run&m=portal/bridge.xts, parameters required to pass the
values of the environment variables

&c_cmd=<command>, to specify the URL-encoded application command to be
executed.

This URL can include application variables, other than environment
variables, to pass to the external application.

&c_env=<variables.xml>, to specify the location of the XML file that
identifies the environmental variables to be passed.

This file must be located under the templates folder in templates/ps.

&c_mode=<get_or_post>, to specify whether to use the get or post request
method when calling the external application.

When you use the post request method, the HTML page created is UTF-8
encoded. When you use the get request method, the values of the variables
are UTF-8 encoded.
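
To make the building blocks concrete, here is a minimal sketch in bash that assembles such a URL. The gateway address, the target application URL, and the bare file name variables.xml are hypothetical placeholders, the helper names urlencode and build_bridge_url are made up for this note, and the encoder only handles ASCII input.

```bash
#!/usr/bin/env bash
# Percent-encode a string for use as a URL query value (ASCII input assumed).
urlencode() {
    local s=$1 out='' c i
    for (( i = 0; i < ${#s}; i++ )); do
        c=${s:i:1}
        case $c in
            [a-zA-Z0-9.~_-]) out+=$c ;;
            *) printf -v c '%%%02X' "'$c"; out+=$c ;;
        esac
    done
    printf '%s\n' "$out"
}

# Assemble the bridge URL from the mandatory building blocks:
# b_action/m, c_cmd (URL-encoded command), c_env, and c_mode.
build_bridge_url() {
    local gateway=$1 command=$2 env_file=$3 mode=$4
    printf '%s?b_action=xts.run&m=portal/bridge.xts&c_cmd=%s&c_env=%s&c_mode=%s\n' \
        "$gateway" "$(urlencode "$command")" "$env_file" "$mode"
}

# Hypothetical gateway and external application; replace with real values.
build_bridge_url 'http://server/cognos8/cgi-bin/cognos.cgi' \
    'http://app.example.com/show?report=sales' 'variables.xml' 'get'
```

The command is the only part that is percent-encoded here, since the text above requires c_cmd to be URL-encoded; how the c_env value must be written relative to templates/ps is not shown in these notes, so it is passed through unchanged.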
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

After you create your extended applications, you register them in the
applications index and then deploy the Cognos Extended Applications
portlet to your portal. In addition to the .jar files required to run a
Cognos 8 SDK application, the Cognos Extended Applications portlet must be
deployed with the following files:

The Java standard tag library (JSTL) files jstl.jar and standard.jar. By
default, these files are installed in the
installation_location/Cognos/c8/webapps/samples/WEB-INF/lib directory.

The CPS tag library file cps-sdk-jsptags.jar. By default, this file is
installed in the
installation_location/Cognos/c8/webapps/samples/WEB-INF/lib directory.

The CPS tag library descriptor file cps.tld. By default, this file is
installed in the
installation_location/Cognos/c8/webapps/samples/WEB-INF/tld directory.

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Wednesday, July 15, 2009

cognos sdk study note

MDL represents Transformer models in text format and you can view the text
in any text editor. The .py? file is a representation of the model, stored
in default binary format.

You can use MDL in two ways:
to create an MDL model that completely defines a Transformer model
to create an MDL script that manipulates or updates an existing
Transformer model

Each Transformer model must have at least one model object. The model may
also contain one or more of each of the following object types:

Data Source
PowerCube
Cognos Source (a Cognos 8 package)
PowerCubeGroup
Column
PowerCubeGroupCube
Dimension
Subdimension
Level
Signon
Root Category
View
Drill Category
Custom View
Regular Category
Currency Table
Special Category
Currency
Measure
Dimension Calculation Definition

Automate CleanHouse Sample

This script uses the MDL verb CleanHouse, which is equivalent to selecting
the Clean House command on the Tools menu of the Windows interface,
followed by a date in the format YYYYMMDD. This command removes all
categories with a Last Use date less than the specified date.

Tip: To remove all categories in the model, specify a future date.

This script removes all categories in the Products dimension that have a
Last Use date prior to November 1, 2006, and saves the model.
OpenPy "model.pyj"
CleanHouse Dimension "Product" 20061101 SaveMDL "model.mdl"

life sucks......................................

Tuesday, July 14, 2009

transformer documents notes

The Transformer 8.3 executable name is now cogtr.exe on Windows, and cogtr
on UNIX/Linux.

Scale
Decimal values are read into the model based on a scale that you
specify.

You apply security to an entire PowerCube or cube group by setting a
password to restrict access to authorized users.
cogtr.xml seems to be useful.
You can invoke Transformer from the command line to populate models,
create cubes, and perform other actions. How and where these actions are
performed is determined by information that you provide.
Command Line Options

Transformer can perform certain modeling and cube-building tasks from the
Windows, UNIX or Linux command line.

Note: You must invoke the command line utilities from the directory where
the Transformer executable resides. In Transformer 8.3, this is the bin
directory.

This document provides the syntax for the following routine tasks:

creating or updating cubes

updating a model with new categories created during the category
generation process

running a set of batch jobs with different preference settings or input
files

changing the current date setting so that relative time calculations in
batch cube creation are based on a specific date

supplying database signon information

changing the degree of detail for log file messages

opening and deleting checkpoint files

verifying and, if necessary, updating the scales used in MDL model columns
and measures to match those used in the data source

publishing cubes in Content Manager

supporting Cognos 8 prompts

opening specified .mdl files and executing MDL statements

specifying the number of records for a test cube

regenerating categories, and the measure scales used with them, without
building the cube

Sunday, July 12, 2009

Cognos sdk dev log

An action is a request made to the FM Engine. Actions are XML elements
that contain input parameters. Some actions also have output parameters.
Actions are defined in the CR1Behaviors.xml file, available in the
<c8-location>\templates\bmt\Cr1Model directory.
Some actions do not change the state of the model in the Framework Manager
application and are not typically recorded in the action logs. Examples
of actions that are not recorded are DBBrowse and Publish. Other actions
are recorded even though they do not change the state of the model. An
example of this type of action is DBRelease.

Modifying the Log Status of Actions
You can modify the log status of an action to determine whether or not you
want it to appear in the action logs.

Using the Framework Manager SDK

You can use the Framework Manager SDK to perform all the same metadata
modeling tasks and processes as the Framework Manager application. For
example, you can

import a data source

enhance query subjects with SQL, expressions and filters

create model query subjects to extend the value of data source query subjects

create a basic package

publish a package to report authors

Cognos SDK study notes

The Model Schema

The Model schema validates the model.xml file, the xml representation of
the model. The Model Schema reference contains information about the
elements and attributes in the model.xml file.

The Framework Manager application records all the actions you do that
modify the metadata model. These actions are recorded in action logs.
Action logs are XML files that you can re-use and re-run in the Framework
Manager application. You can use these Framework Manager action logs as
examples to help you create your own action logs for the SDK.
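
Since action logs are plain XML, they can be inspected with any XML parser before being reused. The Python sketch below lists the actions in a log. The sample log is hypothetical, and the element and attribute names (bmtactionlog, transaction, action, inputparams, param, value, seq, type) are assumptions about the layout, so compare them with a log recorded by your own Framework Manager. "ExampleAction" is an invented action name; DBRelease is the recorded action mentioned in these notes.

```python
import xml.etree.ElementTree as ET

# Hypothetical action log; element and attribute names are assumptions.
SAMPLE_LOG = """\
<bmtactionlog>
  <transaction seq="1">
    <action seq="1" type="ExampleAction">
      <inputparams>
        <param seq="1"><value>example input</value></param>
      </inputparams>
    </action>
  </transaction>
  <transaction seq="2">
    <action seq="1" type="DBRelease">
      <inputparams/>
    </action>
  </transaction>
</bmtactionlog>
"""


def list_actions(log_xml):
    """Return (transaction seq, action type, number of input params) tuples."""
    root = ET.fromstring(log_xml)
    rows = []
    for transaction in root.iter("transaction"):
        for action in transaction.iter("action"):
            params = action.findall("./inputparams/param")
            rows.append((int(transaction.get("seq")), action.get("type"), len(params)))
    return rows


for row in list_actions(SAMPLE_LOG):
    print(row)
```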

Friday, July 10, 2009

Cognos SDK development notes

Cognos 8 is a solution designed to address the challenges of
enterprise-scale reporting, analysis, scorecarding, and event notification.

The Cognos 8 architecture is based on a typical three-tiered Web
architecture that consists of
*a Web server
The Cognos 8 Web server tier contains one or more Cognos 8 gateways.
*applications
The Cognos 8 applications tier contains one or more Cognos 8 servers. A
Cognos 8 server runs requests, such as reports, analyses, and queries,
that are forwarded by a gateway.
*data
The Cognos 8 data tier consists of the content store, data sources, and
the metric store.

Service-Oriented Architecture

Developers can use a collection of cross-platform Web services, libraries,
and programming interfaces provided with the SDK API to access the full
functionality of Framework Manager. You can use the Framework Manager SDK
to model metadata and publish packages without the use of the Framework
Manager application.

Script Player is a command line utility that runs previously created
action logs to create or modify models, or to publish packages for report
authors. Script Player can run an entire action log or only the part that
is required for a specific task.
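
The idea of running only part of an action log can be pictured as picking a range of transactions out of the XML. The Python sketch below does that by hand. It does not show how Script Player itself is invoked, and the element names (bmtactionlog, transaction) and the seq attribute are assumptions about the log layout, not taken from the schema.

```python
import xml.etree.ElementTree as ET


def select_transactions(log_xml, first, last):
    """Return a new log containing only transactions with first <= seq <= last."""
    root = ET.fromstring(log_xml)
    # Iterate over a copy of the list because elements are removed in the loop.
    for transaction in list(root.findall("transaction")):
        if not first <= int(transaction.get("seq")) <= last:
            root.remove(transaction)
    return ET.tostring(root, encoding="unicode")


# Hypothetical three-step log; element names are assumptions.
LOG = (
    "<bmtactionlog>"
    '<transaction seq="1"/><transaction seq="2"/><transaction seq="3"/>'
    "</bmtactionlog>"
)

# Keep only the second and third transactions.
print(select_transactions(LOG, 2, 3))
```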

Virtually everything you can do with the Cognos 8 graphical user
interface, you can do using the appropriate API, XML file, or command line
utility.

You can manage authorization in Cognos 8 in the following ways:

You can modify users, groups, and roles as defined by your authentication
provider or in Cognos 8. You cannot create users in Cognos 8.

You can control access to objects in the content store, such as folders,
reports, data sources, and configuration objects through the use of
policies.

You can grant capabilities to users, groups, and roles to control access
to specific functionality in Cognos 8.

You can assign access permissions to certain profiles, limiting Report
Studio functionality to specific users, groups, and roles.

Running an Object Using the Schedule


The URLs provide a quick and efficient way to start Cognos 8 components
and open specified content, such as reports, metrics, folders, or pages.

You can use the URLs to

start Cognos 8 components

access a Cognos Connection page
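
A minimal Python sketch of assembling such a URL. The gateway address and the search path are hypothetical, and the parameter names (b_action, ui.action, ui.object) are written from memory as assumptions; verify them against the Cognos 8 documentation before relying on them. The part worth noting is that the search path must be URL-encoded.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def report_url(gateway, search_path):
    """Build a URL asking the gateway to run the object at `search_path`."""
    params = {
        "b_action": "cognosViewer",  # assumed parameter names; verify in the docs
        "ui.action": "run",
        "ui.object": search_path,
    }
    return gateway + "?" + urlencode(params)


url = report_url(
    "http://localhost/cognos8/cgi-bin/cognos.cgi",         # hypothetical gateway
    "/content/package[@name='Sales']/report[@name='Q1']",  # hypothetical search path
)
print(url)

# Decoding the query string gives back the original search path.
print(parse_qs(urlsplit(url).query)["ui.object"][0])
```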

You create an extended application by using the CreateURI, URIParameter,
Cognos8Connect, and EncodeNamespace tags in a JSP document.

Wednesday, June 10, 2009

Moving house >_<

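The lease on my old place ran out on June 9. By my count I rented it for nearly half a year and actually lived in it for two or three months of that... (business trips are such a pain). Now the term is up and the rent is going back up too. So, time to look for a place.
Thursday: feeling down. I forced myself not to go to a certain place, just to show that I'm not that pushy. Then I saw a rental notice by the roadside, a single room for 1200, which was exactly within my criteria, so I started viewing.
Two rooms. The 1200 one had a bathroom much like my previous one: pass. The other was cheaper but small: pass.
Then I went on Ganji to look at listings: two-bedroom, 1000-1500, Mudanyuan. I made countless calls, working back as far as listings posted in May, and was told those were too old.
On the weekend I started viewing in earnest. One at 1400: decent, fairly large, looked as if it had never been rented, quite clean; I bargained it down to 1350 with no agent fee. Then I went to see another one. It was only 1000 and sat between the 2nd and 3rd Ring Roads, but the room was small, half the advertised 15 square meters.
That evening my would-be roommate told me they didn't want to spend that much money to live in such a poor place. Depressing.....
Saturday I started looking for a place to live alone. The first one, 750: nicely furnished, but the room was only 4 square meters..... half what was advertised. Ugh.
Saw another one at Jianxiangyuan: 800 a month, a partitioned unit, a small room plus a balcony. It seemed OK. At the same time I was told we should share a flat after all. The agent turned out to be a decent guy, so I took down his number and treated him to a meal; it turns out he used to be a doorman at a five-star hotel....
Monday: settling on a place. The Mudanyuan one my roommate found had already been taken, so we had to go with the one I found, even though they kept changing their minds. It was infuriating, but we needed a place before the 9th, so there was no choice. I discovered that an extra partition had been put up inside the flat and was furious, but was told by 30S that since the deposit was already paid I shouldn't dream of getting the price lowered.
Starting Tuesday, I was also told I had to go on a business trip. I wanted to invite someone to a movie and found I couldn't get tickets. Ugh. And the new place wasn't ready yet. Ugh. So I had to spend one more night in the old room. At ten I went out to visit my classmate Zhang, who had come all the way to Beijing.
Today: hand back the old place, take over the new one, leave on the business trip.....
If I earned 10K a month, I could rent a 2500 place and wouldn't have to go through all this trouble.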

Sunday, June 07, 2009

I'm such a creep.

Secretly looking at the traces someone has left online.
With every page I open, I marvel at how creepy I am.

Sunday, May 31, 2009

She still hasn't replied to my text today

Am I being too pushy? Sometimes I wonder why I always have to torment myself. Is this something I chose?

Sunday, May 24, 2009

.

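When I was young, I thought being alone was really no big deal. But you always end up meeting a woman you like. When I want to do something about it, I think it over a lot, many times over, which is not the same as thinking of many things. If I were an interesting person, I could come up with many things too. When people judge others, they always list the qualities a good man or a good woman should have, yet they often overlook one important point: a person must be interesting. In Gu Long's novels, when heroes meet, or rather when people of the jianghu meet, they may disregard morality and personal taste, yet they often say 'interesting', and then the two start drinking and perhaps become sworn friends for life. So if a person is dull, then however good his character, he is no more than a handsome statue in a temple hall. I seem to be exactly that kind of person.
When I was young I fantasized about all sorts of ways to live as long as heaven and earth, but eventually I woke from the dream. A person must inevitably perish, so I stopped thinking about how life could be prolonged and began thinking about how life could be meaningful, and then about whether life has meaning at all. If it does, what is that ultimate meaning? It is a question every thinking person has pondered. There is no answer, and even if there is, a person can usually only be convinced by their own answer. Can I find mine?
But now I understand: humanity is just a tiny kind of creature, and since every independent person is one of its individuals, each necessarily carries its flaw, namely that he exists in this material world and possesses energy. Yet it is precisely this that makes human existence possible, so existence itself is not fundamental and has no necessity. The meaning of all words lies in describing a subject, that is, an object; since there is no subject, what is used to describe it, that is, meaning, does not exist.
If life is held to be meaningless, then every act is meaningless. For me now, though, life is not nonsense; I must find some meaning to attract me and drive me to act.
Of course there are still times when I act, and times when feeling prevails: a stranger's praise or a familiar person's reproach still brings joy or sorrow. One always has likes and dislikes toward things, events, and people.....
Sometimes simple persistence is useless, like a fool trying again and again to net fish in a river with a stick; nothing can come of it....
I understand this, but I am such a dull person; what can I do to change?
So when I hear Karen Mok (莫文蔚) sing in 真的吗 "I miss you I miss you I miss baby everyday,...I love you I love you I love you baby everyday", I really am close to tears.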

Wednesday, April 22, 2009

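Visited the First Automobile Works (FAW) @ China

Yesterday I vaguely heard that today we would be attending some car-related activity. Today I learned that it was a visit to FAW, which was a bit of a surprise.
The drive took about 20-30 minutes, and then, led by a FAW staff member, we toured one of FAW's assembly workshops. No photos.
We mainly toured the Audi assembly line, that is, the line where parts are assembled into complete vehicles. The line has 86 steps in all and is about 500 meters long, not counting the turns. Each step takes about 3 minutes and needs 1-4 workers. A full pass through the line takes about four hours. Current output of the line is about 420 vehicles, against a design capacity of 200,000 vehicles per year.
Assembly starts with the body and proceeds from back to front, inside to outside, top to bottom: trunk lid -> interior wiring -> front end -> glass -> chassis -> windows. The whole process is mostly manual, with very few robots. I noticed a few interesting things: the workers use a fairly tough strip of paper to check how tightly the trunk lid closes, and a small block to check the flatness (or curvature?) of the roof. Simple and practical little tricks. Most workers on the line are young men; women and older workers are rarely seen. Also, cars of the same model in different colors come off the same line together.
Well, that's basically it. The workshop was quite clean and not very noisy. Perhaps because the production line was made in Germany, the slogans and warning signs hung in the workshop all carry German text.
All in all, the visit felt rather monotonous, with nothing especially striking. On the way back we saw the monument to New China's first Jiefang truck, which really was moving.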

Friday, April 10, 2009

Getting old?

I tried Uncharted Waters, Diablo II, and Nobunaga's Ambition, and I seem to have lost interest in games?

Wednesday, April 08, 2009

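Starting to use Linux

I never really had the motivation to use Linux before. Recently I learned a bit of Perl and felt that Linux would be a more native and more powerful environment for it. I also happened to have a new laptop, so I installed Fedora.
One more thing: many laptops now ship with Vista, which I find a real hassle, and Lenovo's SATA driver for XP doesn't work well either; I just barely got it installed and am making do.
I chose Fedora 10.
After installing, I first added my user to sudoers, then ran yum update.
Then I installed Opera and the python input method, and then came here to write this.
I found that I can log in remotely with rdesktop on Linux, and it works really well.
Linux still has its inexplicable problems, though, and character encoding is forever an issue.
Everything else is fine and I've gotten quite used to it. The keyboard layout handling is great: I no longer have to set a keyboard layout for each input method I switch to. Dvorak was easy to set up too.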

Saturday, January 03, 2009

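2008 in review

January: first semester of senior year ended; did not run the marathon.
February: went home.
March-April: worked on my graduation project.
April-May: went to Guangzhou to look for a job while continuing the project.
June-July: graduated + changed jobs.
July-December: started my career.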