Sunday, March 22, 2009

If there is a dependency, make it explicit

In "Working Effectively with Legacy Code", there is a chapter describing how to get a class that calls a singleton object's method in its constructor into a test harness. The author argues that singletons (or global data) make code opaque and hard to understand. I didn't quite get the point when I first read it. Only recently, when I was reading a program's source code in order to write some documentation for it, did I begin to understand the author's point, and I found that it couldn't be more right.

The program read some data from a server, organized it, and displayed it in widgets. I wanted to find out what data was needed to initialize a table. The widget class took no parameters in its constructor and had no member data that looked like it came from the server. But it had a method named "PopulateTable", which seemed like a good place to start. I checked this method and found that it queried some data through APIs in a global namespace, followed by a bunch of method calls that seemed to process the data and set up the widget. I thought that was all and let the program continue. However, the program then stopped at a breakpoint in the communication module, which really surprised me. I checked the call stack and found that I had missed a method named "SetXXXXX" in "PopulateTable", which took no parameters and queried some additional data through another API in the global namespace. Thank god I had put a breakpoint in the communication module, and that this was the first time the program loaded that data: afterwards it would be cached in local memory, and the program would never have hit the breakpoint again. Luck is the last thing we want to rely on, right? So I had to go back to the class and search for every place where the global namespace was used. And thank god again, it was an easy case. If the class had used many other classes, I would have had to go through all of them...

This is an example of how singletons, global APIs, or global data make code hard to understand. They hide a class's dependencies in its implementation details, so you cannot tell what the class needs, or what it does to data outside itself, without reading every line of it. Consider an alternative design for this class: either it receives the data explicitly through its PopulateTable method, or its constructor takes a data class as a parameter, and that data class has explicit methods for pushing the data into the widget. Wouldn't the code be much clearer and easier to understand?
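Here is a minimal sketch of the second alternative. The original program's classes are not shown, so every name below (Row, TableDataSource, TableWidget, FakeDataSource) is hypothetical; only PopulateTable comes from the story above. The point is just that the constructor signature now tells the reader where the data comes from, and a test can hand in canned data.

```cpp
#include <string>
#include <vector>

// Hypothetical row type; the real program's data model is not shown.
struct Row {
    std::string name;
    double value;
};

// Hypothetical abstraction over "data that comes from the server".
class TableDataSource {
public:
    virtual ~TableDataSource() = default;
    virtual std::string FetchTitle() = 0;
    virtual std::vector<Row> FetchRows() = 0;
};

// The widget states what it needs in its constructor instead of
// reaching into a global namespace from inside PopulateTable.
class TableWidget {
public:
    explicit TableWidget(TableDataSource& source) : source_(source) {}

    void PopulateTable() {
        title_ = source_.FetchTitle();
        rows_ = source_.FetchRows();
    }

    const std::string& title() const { return title_; }
    const std::vector<Row>& rows() const { return rows_; }

private:
    TableDataSource& source_;
    std::string title_;
    std::vector<Row> rows_;
};

// Canned data for a test harness; a production source would
// talk to the communication module instead.
class FakeDataSource : public TableDataSource {
public:
    std::string FetchTitle() override { return "demo"; }
    std::vector<Row> FetchRows() override { return {{"a", 1.0}, {"b", 2.0}}; }
};
```

With this shape, testing a boundary condition means writing another small TableDataSource rather than hunting for global calls.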

So what is the problem with global things? The most serious one is that they make developers lazy. If I have a global interface, why would I bother to add a parameter to a method or a class, not to mention think about the abstraction? As a consequence, implementation details end up everywhere, the code has no layers, and every part of the code is coupled to every other part. Then it becomes a nightmare for the person who maintains it. Want to modify some functions? (I can't understand the code in a short time.) Want to put them into a test harness? (I want different kinds of data to test several boundary conditions, but the class gets its data from a global interface. How can I mock the data?) Want to extend the functionality? (I want to add another way to process the data, but the class uses global interfaces to process it. How can I effectively reuse the original code?) And so on. Each of these questions will kill the maintainer.

There is another subtle implicit dependency that we tend not to notice: events in a GUI.
I once added a method to a plot class; the method loaded data from a file and showed it. The plot had another method that recalculated the plot data from some results. But a plot loaded from a local file didn't have those results, so it had to query them from its parent window, and I had to add that functionality. "Dependency is bad", I thought. So I decided to post an event upward, so that the plot did not need to know who its parent was.
Did this plot depend on its parent class? It seemed not to. But in fact it did. It depended on the parent window responding to the event in the right way to guarantee that the plot was in a good state. The really bad side of this approach is that it makes the dependency non-obvious. If we overlook this constraint and change the parent window to one that doesn't handle this event, the compiler can't give us any error or warning, and we end up in real trouble with subtle bugs.
Looking back at why I decided to use the event, I find that avoiding the dependency was just an excuse. The truth is that the parent window's header file was included by many other cpp files, and I just didn't want to pay the compile time. Yes, I made that decision simply because I was lazy. It did nothing to break the dependency. If I had really wanted to break it, I would have written a controller class to coordinate the data between the parent and the plot.
So think twice before you use events as a communication mechanism between widgets. In my opinion, a controller is always a better choice, because it shows an obvious and meaningful dependency. (If many widgets depend on each other, most of the time it's a sign that these widget classes violate the SRP; you have to split the business logic out of them and make the widget classes responsible only for showing something.)
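A rough sketch of what such a controller might look like. The real plot and window classes are not shown, so ResultsWindow, Plot, PlotController, and the running-sum "recalculation" are all assumptions made for illustration. What matters is that the plot receives the results as an argument, and the one place that knows about both sides is visible to the compiler.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for the parent window that owns the results.
class ResultsWindow {
public:
    explicit ResultsWindow(std::vector<double> results)
        : results_(std::move(results)) {}
    const std::vector<double>& results() const { return results_; }

private:
    std::vector<double> results_;
};

// The plot knows nothing about windows or events; it is handed the
// results it needs. The running sum is a placeholder calculation.
class Plot {
public:
    void Recalculate(const std::vector<double>& results) {
        points_.clear();
        double sum = 0.0;
        for (double r : results) {
            sum += r;
            points_.push_back(sum);
        }
    }
    const std::vector<double>& points() const { return points_; }

private:
    std::vector<double> points_;
};

// The controller is the explicit dependency: swap in a window type
// without results() and the build fails instead of the plot.
class PlotController {
public:
    PlotController(const ResultsWindow& window, Plot& plot)
        : window_(window), plot_(plot) {}

    void RefreshPlot() { plot_.Recalculate(window_.results()); }

private:
    const ResultsWindow& window_;
    Plot& plot_;
};
```

Only the controller's source file needs the window's header, so the compile-time worry that led me to the event is also contained.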

Too many dependencies among classes are bad, but introducing implicit dependencies does much more harm. So when a dependency exists, make it as explicit as possible. It's no fun to surprise the person who reads your code, let alone to surprise users with bugs introduced by implicit dependencies.

Sunday, March 1, 2009

"Agile Software Development": design smells and the principles of OO design

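The defining characteristic of software is that it changes easily. Requirements are like a moving target, so never expect an initial design to keep hitting the bullseye, however perfect it looks at the start. To cope with constant change, we need to keep our designs as lightweight and clear as possible at all times.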

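Smells
For someone who wants to stay tidy, the first thing to figure out is what untidy and rotten look like. Software is the same: a bad design always gives off some smells. So the first step is to learn to smell them.
1. Rigidity: a change affects many places, including ones you would never have imagined were related.
2. Fragility: a change breaks modules that have no logical connection to it.
3. Immobility: you want to reuse the timekeeping part of a clock, only to find that it is inextricably tangled with the chiming part...
4. Viscosity: preserving the design is harder and more costly than breaking it.
5. Needless complexity: is this complexity really needed?
6. Needless repetition: essentially the same code in many places.
7. Opacity: code should be clear and expressive, not a mystery novel.
A good development process should have zero tolerance for these smells. When you run into a smell, even the faintest one, don't let it go, because once the smells accumulate and the whole system rots, the time and cost of fixing it will be far greater than they are now.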

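Some principles of OO design
We keep things tidy by letting a few principles guide our daily behavior, such as not littering and not spitting in public. In software design, likewise, following a few basic principles helps avoid these smells. In fact, many smells arise precisely because the design violates these principles in one way or another.
1. The first principle
Pragmatic
Keep the software as simple and robust as possible. In actual design, whenever you introduce a structure, always ask: "Is this needed?"
A mess made today should be cleaned up today.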
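2. Single Responsibility Principle (SRP)
In the Unix design philosophy, a program should do one thing and do it well. This is an effective way of partitioning overall complexity. Classes in OO design follow the same principle: a class should have only one responsibility.
What is a responsibility? A responsibility is a reason for change. A class should need to change only when one particular factor changes. Such classes are highly cohesive, and the coupling between different responsibilities (unrelated factors) is reduced, which makes the whole system clearer.
Note that the granularity of these factors is governed by actual requirements. If two responsibilities always change together in the application, they can be regarded as one responsibility and should be kept in the same class. Also, a responsibility should be a factor that can really change under actual requirements; if nothing suggests that change is likely, applying any principle indiscriminately is unwise and may introduce needless complexity.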
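3. Open-Closed Principle (OCP)
A module should have two characteristics: a) open for extension; b) closed for modification.
OO uses abstraction to isolate change. Client classes face a stable abstract base class that provides a stable interface, while the concrete classes that do the actual work are hidden behind this stable abstraction. All changes can then be closed off, and new functionality is provided by creating a new concrete class.
In practice we cannot close a module against every kind of change, so we need to study and analyze the actual problem to choose which changes to close against.
Also, applying OCP well is fairly costly, since designing a good abstract interface is not easy. So it is worth applying only where you are sure change must be handled, or only when the first change actually arrives. Better to take the first bullet, and then make sure you are never hit again by a bullet from the same gun.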
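A small sketch of the idea, with hypothetical names (Formatter, PlainFormatter, JsonFormatter, Render) that are not taken from the book. Render is the client: it is closed for modification, yet open for extension, because supporting a new output style only means adding another Formatter subclass.

```cpp
#include <string>

// The stable abstraction that client code is written against.
class Formatter {
public:
    virtual ~Formatter() = default;
    virtual std::string Format(const std::string& name, int value) const = 0;
};

// One concrete behavior hidden behind the abstraction.
class PlainFormatter : public Formatter {
public:
    std::string Format(const std::string& name, int value) const override {
        return name + "=" + std::to_string(value);
    }
};

// A later extension: added without touching Render or PlainFormatter.
class JsonFormatter : public Formatter {
public:
    std::string Format(const std::string& name, int value) const override {
        return "{\"" + name + "\": " + std::to_string(value) + "}";
    }
};

// Client code: depends only on the abstract interface.
std::string Render(const Formatter& formatter, const std::string& name, int value) {
    return "[" + formatter.Format(name, value) + "]";
}
```

In the spirit of the "first bullet" advice, the Formatter base class would typically be extracted only once a second format is actually requested.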
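4. Liskov Substitution Principle (LSP)
Subtypes must be substitutable for their base types.
One easily overlooked aspect is how to interpret "substitutable". Substitution cannot be judged in isolation; it has to be considered in the context of the application. Take a Rectangle class and a Square class: viewed in isolation, having Square inherit from Rectangle seems fine, but if the application assumes that a Rectangle's width and height can be changed independently, the substitution is not valid, and in that context Square should not be a subclass of Rectangle.
So in OOD the is-a relationship is about behavior, and behavioral consistency is determined by the assumptions that client code makes. The base class's behavior can be constrained with the preconditions and postconditions used in function design. For the same behavioral interface in a derived class, the precondition may only be the same as or weaker than the original, and the postcondition the same as or stronger (in short: the input must accept everything the base class's precondition accepts, and the output must not violate the base class's postcondition). The derived class is then transparent to users, whose operations are unaffected by which derived class is in use.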
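The Rectangle and Square case can be made concrete with a short sketch. The member names and the helper WidthAndHeightAreIndependent are my own choices for illustration; the helper plays the role of client code that relies on the Rectangle postcondition "setting the width leaves the height unchanged".

```cpp
class Rectangle {
public:
    virtual ~Rectangle() = default;
    // Postcondition assumed by clients: only the named side changes.
    virtual void SetWidth(int w) { width_ = w; }
    virtual void SetHeight(int h) { height_ = h; }
    int Area() const { return width_ * height_; }

protected:
    int width_ = 0;
    int height_ = 0;
};

// Keeps its own invariant (equal sides) by changing both sides,
// which weakens the postcondition that Rectangle promised.
class Square : public Rectangle {
public:
    void SetWidth(int w) override { width_ = w; height_ = w; }
    void SetHeight(int h) override { width_ = h; height_ = h; }
};

// Client code written against Rectangle's contract. Returns true
// when the object behaves the way the client assumes.
bool WidthAndHeightAreIndependent(Rectangle& r) {
    r.SetHeight(4);
    r.SetWidth(5);
    return r.Area() == 20;
}
```

The check passes for a Rectangle and fails for a Square, which is exactly the sense in which Square is not substitutable in this context.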
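5. Dependency Inversion Principle (DIP)
Modules should not depend on each other directly; both should depend on abstractions and be coupled through them.
Abstractions should not depend on details; details should depend on abstractions.
We need to shed the usual view that a utility library should own its own interfaces. When DIP is applied, the clients own the abstract interfaces, which declare the service entry points the clients need, and the service classes derive from those interfaces. High-level business policy should not depend on low-level details; instead, the high-level policy provides the set of interfaces it needs, and the low-level details implement them.
By depending on abstractions we can build a more flexible and more reusable system. This is a fundamental underlying mechanism of OO design. Concrete classes are always coupled together through stable abstractions.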
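A minimal sketch of the inversion, using a button and a lamp as stand-ins for "high-level policy" and "low-level detail". The names SwitchableDevice, Button, and Lamp are illustrative assumptions. Note which side owns the interface: SwitchableDevice belongs with Button and expresses what Button needs, and Lamp adapts itself to it.

```cpp
// Declared by (and conceptually owned by) the high-level policy:
// it lists the services the policy needs, nothing more.
class SwitchableDevice {
public:
    virtual ~SwitchableDevice() = default;
    virtual void TurnOn() = 0;
    virtual void TurnOff() = 0;
};

// High-level policy: toggles whatever device it was given.
// It has no knowledge of lamps or any other concrete device.
class Button {
public:
    explicit Button(SwitchableDevice& device) : device_(device) {}

    void Press() {
        on_ = !on_;
        if (on_) {
            device_.TurnOn();
        } else {
            device_.TurnOff();
        }
    }

private:
    SwitchableDevice& device_;
    bool on_ = false;
};

// Low-level detail: depends on the abstraction, not the reverse.
class Lamp : public SwitchableDevice {
public:
    void TurnOn() override { lit_ = true; }
    void TurnOff() override { lit_ = false; }
    bool lit() const { return lit_; }

private:
    bool lit_ = false;
};
```

Button can now be reused with any other device that implements SwitchableDevice, and can be tested without a real lamp.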
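6. Interface Segregation Principle (ISP)
A client should see only the interfaces it needs.
In practice there are concrete objects whose interfaces do not need to be highly cohesive, for example because several factors of the class really are strongly coupled. But if such a class is used by several different clients, each client is polluted by many interfaces that are irrelevant to it, and the otherwise unrelated clients become implicitly coupled to each other.
ISP recommends that each client see only an abstract base class with the cohesive interface it needs. This follows from the Dependency Inversion Principle: interfaces should be based on the clients' needs and owned by the clients.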
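As a sketch of the principle, assume a hypothetical Document class used by two unrelated clients, one that only reads and one that only writes. All names here (TextReader, TextWriter, Document, CountCharacters, AppendSignature) are made up for illustration. Each client function is written against the narrow interface it needs, so a change to the writing side cannot force the reading client to change.

```cpp
#include <cstddef>
#include <string>

// Interface for clients that only read.
class TextReader {
public:
    virtual ~TextReader() = default;
    virtual std::string Read() const = 0;
};

// Interface for clients that only write.
class TextWriter {
public:
    virtual ~TextWriter() = default;
    virtual void Write(const std::string& text) = 0;
};

// The concrete object still offers both services, but no client
// has to depend on the combined interface.
class Document : public TextReader, public TextWriter {
public:
    std::string Read() const override { return content_; }
    void Write(const std::string& text) override { content_ += text; }

private:
    std::string content_;
};

// Reading client: sees only TextReader.
std::size_t CountCharacters(const TextReader& reader) {
    return reader.Read().size();
}

// Writing client: sees only TextWriter.
void AppendSignature(TextWriter& writer) {
    writer.Write(" [signed]");
}
```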

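Principles and design patterns
Applying design patterns right at the start of design and coding can lead to over-design, because what is visible at that point often cannot answer the question of whether the structure is needed. So I rather agree with the author's view: in actual design and coding, start with the simplest structure and, through continual refactoring, resolve whatever violates the principles above. In the end you may find that the design or code has come close to a particular pattern; then rename the classes after the pattern and reshape the code into a more canonical form of it. In this way, the code converges on the pattern.
So for a designer, what matters more is understanding and mastering these principles well and never confining oneself to some particular initial structure. The design and code keep evolving toward simplicity and robustness through refactoring that removes the existing smells. The pattern that is finally applied is chosen mostly because it describes the existing code structure more clearly and quickly.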