Repost: Discussing the best J2EE approach, O-R Mapping: Hibernate vs. CMP. Please join the discussion

“The case I ran into was ERP software for the manufacturing industry. Its 80-odd tables were already an extreme simplification (there were originally more than 150), and however much you componentize, it is hard to avoid join queries across dozens of tables.”
Agree!
This is very common in ERP apps. At least a third of my clients use more than 100 tables, and a big portion of them are “infrastructure type” supporting tables that in practice are very hard to separate from the others. Also, due to the nature of incremental development, customers keep adding new cross-cutting concerns to the model, so they always prefer to rebuild/redeploy all components to avoid any inconsistency. I think everyone should agree that in this situation it’s insane to make every table into a CMP entity bean; most of them should just be POJOs.

“As far as I know, for the same requirements, development with .net is three to four times as productive as with j2ee.”
Don’t agree.
Maybe you are not aware, but for many years there have been many J2EE-based RAD tools (SilverStream, Visionbuilder, OptimalJ, etc.). They may be a bit proprietary, but they are very successful at enhancing users’ productivity. Most large projects are done that way. Of course, for a simple web project you can do whatever you want; that is outside the discussion here.

“Given that the average programmer in China is only about 26 years old, understanding and mastering all this is impossible.”
There are many discussions about why software in China has not taken off; I think this is the most important factor. Good programmers should stay and enjoy their work. I have attended JavaOne for many years; each year there are 15,000 to 25,000 participants, and I can tell the majority are in their 30s, with many people in their 40s or 50s attending technical sessions.

Cheers
-Jevang

Jevang, hello! Did you really write an O/R Mapping tool yourself? I'm impressed!

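“As far as I know, for the same requirements, development with .net is three to four times as productive as with j2ee.”

I am not bragging. I could write a very detailed article analyzing this from every aspect of .net and j2ee to back up my claim; once I have finished it I will share it with everyone.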

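banq (moderator), about this point of yours:
In a clustered environment, use the http session as little as possible, because as the number of servers grows, replicating these sessions wastes performance. It is best to keep the Web tier stateless and hold the state in a stateful Session bean (a system generally needs only one); by using EJB's own clustering mechanism (which is not simple session replication) you can achieve very good cluster performance.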

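I can't agree. How does "EJB's own clustering mechanism" synchronize an SFSB? In the end it still has to serialize the bean object to another machine, which is simply in-memory replication.

I don't know how other App Servers implement it, but BEA Weblogic does exactly this: Http Session or SFSB, it is all in-memory replication.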

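The book 《J2EE应用与Weblogic Server》 (J2EE Applications and Weblogic Server) is the Weblogic textbook officially recommended by BEA. It was written by three BEA engineers, so it should be authoritative enough, and it is on sale in bookstores everywhere. It explains in detail how HttpSession and SFSB are implemented in a Cluster. Since the two work the same way, I'll describe them together.

The book states clearly that SFSBs take part in the cluster for the sake of failover. This in-memory replication is not, as you imagine, sent to every other machine in the Cluster like a network broadcast. It is a Master-Slave model: the Master replicates the state to the Slave through serialization, and nothing is propagated to the other machines in the Cluster. When the Master goes down, the Slave is promoted to Master and another machine in the Cluster is picked as the new Slave.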

So which do you say costs more: serializing and transmitting an HttpSession, or serializing and transmitting an SFSB?
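To make the Master-Slave replication described above concrete, here is a minimal sketch in plain Java. It is not WebLogic's implementation: `CartState`, `ClusterNode` and `ReplicationSketch` are hypothetical names, and an in-memory byte array stands in for the network transfer between the two machines.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical conversational state; stands in for HttpSession attributes or SFSB fields.
class CartState implements Serializable {
    private static final long serialVersionUID = 1L;
    final List<String> items = new ArrayList<>();
}

// Hypothetical cluster member that holds one copy of the state.
class ClusterNode {
    CartState state;

    // Serialize the state, as the Master would before sending it over the wire.
    byte[] snapshot() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        return buf.toByteArray();
    }

    // Deserialize the bytes, as the Slave would on receipt.
    void restore(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            state = (CartState) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }
}

class ReplicationSketch {
    // Copy state from Master to Slave only; no other node is contacted.
    // Returns the number of bytes that would cross the network.
    static int replicate(ClusterNode master, ClusterNode slave) {
        byte[] bytes = master.snapshot();
        slave.restore(bytes);
        return bytes.length;
    }

    public static void main(String[] args) {
        ClusterNode master = new ClusterNode();
        ClusterNode slave = new ClusterNode();
        master.state = new CartState();
        master.state.items.add("widget");
        int sent = replicate(master, slave);
        System.out.println("sent " + sent + " bytes; slave holds " + slave.state.items);
    }
}
```

In this sketch the cost is driven by the size of the serialized state, whichever container object happens to hold it, which is the point of the question above.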

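Also, I have always objected to the habit of bringing up clusters whenever j2ee is mentioned. How many chances do we really get to work with a Cluster? I, at least, have never done a clustered project. A project that can afford to cluster its App Servers has a contract value of no less than 10 million RMB, so talking less about clustering is more realistic. As for "automatic load balancing across hundreds of App Servers", apart from the Google site I really cannot think of anyone who would use that many machines.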

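> I wonder how hibernate compares with OJB from the Apache project

When I was deciding on an O/R Mapping tool, I weighed Hibernate against OJB. After downloading both, I found that Hibernate's documentation was very complete while OJB's was sketchy and scarce, so I chose Hibernate without hesitation :)

OJB plans full JDO support in 2.0; for now the support is very limited. So I expect the API may change considerably by OJB 2.0, and it is best not to use OJB for now. Wait until it stabilizes.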

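to robbin

I have read these posts of yours and agree very much.
As for CMP, it really is more trouble than it is worth. We did a project for a bank,
and from the start experts from BEA did the design together with us.
In the end the design was done entirely as SLSB+BMP/CMP.
As I remember it, designing the CMP/CMR part was simply a nightmare. I have been writing software for 5 years,
from VC to JAVA to J2EE, and this project is the most complex system I have worked on:
several hundred EJB components (at least 500), and a dozen or so skilled J2EE developers brought together.
Once it was done, the unanimous opinion among us in private was:
why on earth does something as crappy as CMP exist in this world?

With our abilities, doing the same thing without CMP/CMR
would have saved at least half the brainpower and time.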

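To robbin again

The project I'm working on now has only just entered the design stage.
I'm inclined to use hibernate, but I have really only just come across it,
so I don't dare to jump in rashly.
One question: my project already has more than 300 tables designed.
If I use HB,
then, just as CMP cannot handle complex SQL queries,
will there be things that HB has trouble handling?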

Hi Robin, since you raised the question, I am taking this opportunity to do a bit of marketing.
Topas is my product, positioned as a RAD and powerful Java development environment. I did not build features based on a spec; it's the other way around: it's an aggregation of useful features from many products/projects, re-architected and reimplemented in a way that conforms better to J2EE conventions. It has a transparent persistence service, but I won't label it "O-R Mapping": it does not support inheritance, one-to-many, many-to-one and that kind of stuff out of the box, as I was never a big fan of fancy O-R mapping. I acknowledge that it is useful in certain cases, but you can work around it in different ways; in my 8 years in the app server industry, only occasionally has sophisticated mapping become a crucial issue.
I did my persistence engine in the first half of 2002, as I knew EJB just wouldn't be ready for large systems any time soon. Today I would probably be more hesitant to do it myself; integration may make more sense.
Since then my development focus has shifted to other things: XML modeling, full life cycle development, business rules, presentation, JMS integration, etc. This year is about massive XML processing and Web services, less innovative than my work in earlier days, but it gives Topas wider coverage of enterprise and EAI needs.

Regarding Java vs. .NET, I can assure you that for desktop applications, JBuilder+Topas can develop simple or complex multi-tier Java apps with productivity that matches, if not exceeds, any .NET approach. Due to lack of resources and my skill gap in UI tooling, I won't claim that for Topas on HTML apps, but there are other comparable products that take a similar approach to Topas for JSP and have a good UI builder; their presentation development process is fascinating.

Java is not just SUN, IBM, BEA, JBuilder + open source; it's a whole industry. There are many good things that may not be popular in China for marketing, licensing or support reasons, but they have a wide customer base in the West (I want to differentiate customers from users, who may not pay a dime) and in any given category they enjoy the technical lead over M$, or are at least able to counterattack quickly. Of course, it takes time to sort those things out; that's why we spend time online on forums and blogs.

Cheers
-Jevang

Oh,
btw, Topas is ready for eval; if anyone is looking for a development product, feel free to contact me.

-Jevang

Sorry for temporarily hijacking this thread to promote Topas; now back to the main topic.
Since several people mentioned "Cluster" here, I think we need to clarify two concepts in clustering: load balancing and fail-over. Load balancing is used in almost every production site I see, while fail-over, which requires more overhead due to replication, is very rarely used. For load balancing, either http or SB will do; it does not matter. But for fail-over, since EJB dictates more standards on object life cycle management (activation, deactivation, etc.) while the http session is more vendor specific, fail-over at the SB layer is more common.

-Jevang

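> Will there be things that HB has trouble handling?

Of course there are. Hibernate really only supports operations on data in database tables, plus limited support for sequences; no other database objects are supported. Views, stored procedures, functions, structs and so on: none of these advanced database features are supported. It also does not support DDL such as create table, drop table, truncate table or alter table add index; it only supports DML.

Because Hibernate only wraps operations on tables, it supports only select, insert, update and delete. It is worth noting, though, that Hibernate supports almost every select syntax you can think of, including subqueries, joins, aggregate functions and so on. Hibernate also does an excellent job of mapping complex table associations.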

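But I think Hibernate's most distinctive feature is the flexible mapping of several classes to one table. For example, a parent class and its subclasses can all map to one table, or a class can map just a few columns of a table; these features are very useful. The Hibernate documentation gives an example: an employee table holds employee information and can be mapped by one class; the table has two columns, the employee's first name and family name, so you can additionally map those two columns with a Name class representing the employee's full name.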
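For readers new to the idea, here is a rough sketch of what the Java side of that employee example might look like. The class and field names are assumptions for illustration, not copied from the Hibernate documentation, and the mapping file that tells Hibernate to store `Name` in two columns of the employee table (its `<component>` element) is not shown.

```java
// Hypothetical value class covering two columns of the employee table;
// under a component mapping it has no table of its own.
class Name {
    private String first;
    private String family;

    Name() {} // no-argument constructor, which persistence tools generally expect

    Name(String first, String family) {
        this.first = first;
        this.family = family;
    }

    String getFirst() { return first; }
    void setFirst(String first) { this.first = first; }
    String getFamily() { return family; }
    void setFamily(String family) { this.family = family; }

    // The "full name" view that motivates having a separate class.
    String full() { return first + " " + family; }
}

// Hypothetical entity class; one row of the employee table.
class Employee {
    private Long id;
    private Name name; // both name columns live behind this single property

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    Name getName() { return name; }
    void setName(Name name) { this.name = name; }
}

class ComponentSketch {
    static Employee sample() {
        Employee e = new Employee();
        e.setId(1L);
        e.setName(new Name("Wei", "Li"));
        return e;
    }

    public static void main(String[] args) {
        System.out.println(sample().getName().full());
    }
}
```

The design point is that two classes share one table: `Employee` owns the row, while `Name` groups a few of its columns into a reusable object.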

Hibernate also supports Blob and Clob.

But if you want to use database views or call stored procedures, database objects that are not tables, you have to write JDBC code directly.

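The Master-slave clustering model described in the book J2EE应用与Weblogic Server is outdated. JBoss uses the broadcast mechanism you mentioned; it is a JNDI cluster. The latest technical material on this is all available there.

Clustering is very useful. For example, I sometimes run this very website on two machines at once, and the netsh.com site runs on 10 servers. Don't be fooled by the fact that the site seems little known: its traffic is heavy, peaking at 5 million page views a day. The crudest approach, which we used, is to have users sort themselves with hosts like sh.netsh.com and bj.netsh.com, much as the Lianzhong (联众) game platform splits rooms across servers. All of this will be made obsolete by clustering technology: a user only has to type a single domain name, jdon.com, to interact with millions of people.

The Internet has brought us not just technical change but the test of extreme concurrent traffic. My concern with concurrent performance has its reasons: I use J2EE precisely for its mature clustering and distributed technology. Otherwise, as you say, with so many faster options, why not use them?

So when we talk about J2EE technology we inevitably have to care about how it behaves in a clustered environment, because that is the strength of J2EE and Java and the purpose they were created for. I will never forget this.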

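Also, the comparison of .net productivity with J2EE development is unacceptable; in particular, comparing the Petstore implemented in .net with the java Petstore is downright unfair. This was discussed abroad long ago; you can look it up on the SUN website.
In a project where many people collaborate, J2EE has by now undoubtedly proved to be efficient. In a large system, developing fast is no longer a matter of how quickly one individual works but of overall efficiency. Software engineering exists mainly to produce efficiency, and the higher the efficiency, the faster the project. In this respect J2EE is undoubtedly the most efficient. Of course, I believe that with practice .NET should become efficient in large projects too, but so far I have not seen the theoretical basis for it, whereas we have spelled out J2EE's in detail: multi-tier structure, separation and delegation, then assembly through XML configuration. I really don't know what new trick .NET has other than going down the same road. That is the middle-of-the-road Microsoft, which is why it is loved by middle-of-the-road compatriots.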

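uu_snow, from a project management point of view, a technology that has not been thoroughly mastered and proven in practice should not be adopted lightly in a project.

I suggest you do it this way: first design the DAO interfaces, then implement the DAOs with the technology your team knows best. Then, when the project has some slack, take the DAO of one unimportant module, re-implement it with Hibernate, and test it.

If Hibernate really does meet your requirements, gradually write a second, Hibernate-based implementation of the other DAO interfaces.

At deployment, still ship your original technology first, and use Hibernate as the upgrade/beta version. After it has been tested through a whole software development cycle, then consider replacing everything with Hibernate.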
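The advice above can be sketched as follows. This is only an illustration under stated assumptions: `Part`, `PartDao`, `FamiliarPartDao`, `TrialPartDao` and `PartService` are hypothetical names, and both implementations use an in-memory map so the example stays runnable; a real second implementation would delegate to Hibernate instead.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain object.
class Part {
    final long id;
    final String name;
    Part(long id, String name) { this.id = id; this.name = name; }
}

// Step 1: the DAO interface is designed first and is all the business code sees.
interface PartDao {
    Part findById(long id);
    void save(Part part);
}

// Step 2: stand-in for the implementation in the team's familiar technology.
class FamiliarPartDao implements PartDao {
    private final Map<Long, Part> rows = new HashMap<>();
    public Part findById(long id) { return rows.get(id); }
    public void save(Part part) { rows.put(part.id, part); }
}

// Step 3: stand-in for the trial re-implementation of one unimportant module.
// A real version would call Hibernate here; this one only mimics the contract.
class TrialPartDao implements PartDao {
    private final Map<Long, Part> cache = new HashMap<>();
    public Part findById(long id) { return cache.get(id); }
    public void save(Part part) { cache.put(part.id, part); }
}

// Business code depends on the interface only, so the DAO can be swapped at deployment.
class PartService {
    private final PartDao dao;
    PartService(PartDao dao) { this.dao = dao; }

    String describe(long id) {
        Part p = dao.findById(id);
        return p == null ? "unknown part" : p.name;
    }
}

class DaoSwapSketch {
    // The same check runs against either implementation.
    static boolean behavesCorrectly(PartDao dao) {
        dao.save(new Part(7L, "bearing"));
        PartService service = new PartService(dao);
        return service.describe(7L).equals("bearing")
                && service.describe(8L).equals("unknown part");
    }

    public static void main(String[] args) {
        System.out.println("familiar ok: " + behavesCorrectly(new FamiliarPartDao()));
        System.out.println("trial ok: " + behavesCorrectly(new TrialPartDao()));
    }
}
```

Running one shared check against both implementations is what makes the gradual switch low-risk: the trial DAO has to pass exactly what the original passes before it replaces it.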

I. On Cluster

Jevang, banq, I really have not studied Clusters much. Although I'm not entitled to an opinion, I'd like to share my thoughts.

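II. On .net development productivity

Jevang, I was not referring to desktop application development; I mean building architecturally complete Web applications. banq, I also did not conclude that .net is more productive because of TMC's .net petstore vs j2ee petstore benchmark report. That benchmark of TMC's is basically worthless.

Actually I think you only need to try it yourself to see. Take an example:

To develop distributed components in j2ee, I think everyone would use EJB rather than programming RMI directly. But even to develop one SLSB you have to write a Bean, two interfaces, and one or more EJB configuration files in XML or some other format. Unless you have a highly automated IDE that generates EJB code and configuration files for you, EJB development is definitely no easy matter. In any case you cannot deny that developing an EJB is much harder than developing a Java Class, much less productive, much harder to debug, and conceptually much more complex.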

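And for a client, calling this EJB is not as easy as new-ing a class, right? You have to look up the Home interface's stub through JNDI, call its create method to get the Remote interface's stub, and then you can use it like an ordinary class, but you still have to handle RemoteException.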
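The client-side steps just listed can be sketched roughly like this. To keep the example self-contained it does not use a real EJB container: `Greeter`, `GreeterHome` and `FakeNaming` are hypothetical stand-ins (a real client would obtain the home through a JNDI `InitialContext` lookup and would also deal with the create method's own exceptions), while `java.rmi.Remote` and `java.rmi.RemoteException` are the actual JDK types.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical remote interface: the business methods the client finally calls.
interface Greeter extends Remote {
    String greet(String who) throws RemoteException;
}

// Hypothetical home interface: the factory the client has to look up first.
interface GreeterHome extends Remote {
    Greeter create() throws RemoteException;
}

// Stand-in for the naming service; not a real JNDI implementation.
class FakeNaming {
    private final Map<String, Object> bindings = new HashMap<>();

    void bind(String name, Object obj) { bindings.put(name, obj); }

    Object lookup(String name) {
        Object obj = bindings.get(name);
        if (obj == null) throw new IllegalArgumentException("not bound: " + name);
        return obj;
    }
}

class EjbClientSketch {
    // Three steps before any business call: look up the home, create, then invoke.
    static String callGreeter(FakeNaming naming, String who) {
        try {
            GreeterHome home = (GreeterHome) naming.lookup("ejb/Greeter"); // hypothetical name
            Greeter greeter = home.create();
            return greeter.greet(who);
        } catch (RemoteException e) {
            // Every call through a remote interface forces this handler on the client.
            return "remote call failed: " + e.getMessage();
        }
    }

    // Lambdas play the role of the container-generated stubs.
    static FakeNaming demoNaming() {
        FakeNaming naming = new FakeNaming();
        GreeterHome home = () -> who -> "hello, " + who;
        naming.bind("ejb/Greeter", home);
        return naming;
    }

    public static void main(String[] args) {
        System.out.println(callGreeter(demoNaming(), "robbin"));
    }
}
```

Compared with `new Greeter()`, the extra lookup, the factory call and the mandatory exception handling are the ceremony the post is complaining about.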

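In .net, however, developing a distributed component is exactly like developing an ordinary Java Class, and calling a distributed component is just like new-ing a class. There is no difference at all from writing an ordinary Java class.

Server-side components are placed in the bin directory of a Web App in IIS (similar to WEB-INF\classes in a Java Web App). If you want to turn a .net Assembly (conceptually equivalent to a Java class) into a distributed component, the only thing you need to do is add one line to web.config (similar to web.xml) declaring that .net assembly as a remote component. The rest is left entirely to ASP.net.

Calling a remote component from the client is similar: in the configuration file of the calling code you declare that the component is not local but comes from a certain URL. Then the client news the remote class and can use it.

So in .net there is no such concept as developing a remote component, because the code is identical whether a component is local or remote; you only specify at deployment time, as the situation requires, whether it is a distributed component.

Think about the process of developing, configuring, deploying and calling an EJB from a client in Java, then look at .net, where you just write an ordinary cs program (equivalent to a Java program), add one line of declaration to the server-side configuration file and one to the client-side configuration file, and you can new it and call it. Isn't that a big difference in development productivity?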

Hi Robin,
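I don't want to be defensive, but I still want to make things a little bit clearer.
I. On Cluster

Jevang, banq, I really have not studied Clusters much. Although I'm not entitled to an opinion, I'd like to share my thoughts.

As I see it, load balancing does not require a Cluster. Round-robin DNS resolution is a simple and workable form of load balancing, and that is what Yahoo does. I also think load balancing depends more on support from the network hardware and software environment than on the Web Server or App Server. Building a Cluster is not just a matter of configuring a few JBOSS or WebLogic instances; it also involves a lot of network storage issues, so I consider it a problem of another domain.
<< You are right about hardware support, but I thought we were just talking about JVM-based app servers, so let's stay on this topic. My co-worker wrote an excellent paper about GC (actually IBM also has lots of Redbooks on this subject); GC alone is enough reason to force people to use multiple VMs on multi-processor boxes. >>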

II. On .net development productivity

Jevang, I was not referring to desktop application development; I mean building architecturally complete Web applications.
<< That's what I am talking about: full life cycle, multi-tier apps, business logic, persistence, modeling, DB schema gen/reeng, code gen, appserver deployment, etc. The reason I point out desktop apps is that I want to separate them from HTML-based presentation development in Topas: until Topas is integrated with some nice UI tool (Dreamweaver, e.g.), it relies on a text editor to build JSP pages (though I use visual Struts tools for flow control), and can't compete with .NET studio, nor with many solid JSP studios. >>

banq, I also did not conclude that .net is more productive because of TMC's .net petstore vs j2ee petstore benchmark report. That benchmark of TMC's is basically worthless.

Actually I think you only need to try it yourself to see. Take an example:

And for a client, calling this EJB is not as easy as new-ing a class, right? You have to look up the Home interface's stub through JNDI, call its create method to get the Remote interface's stub, and then you can use it like an ordinary class, but you still have to handle RemoteException.
<< Do you mean J2EE == "Everything has to be CMP EJB"? If yes, I don't agree with your definition, but I admit you make a valid point, except that I would say things are changing very fast; it's getting better every few months. Again, don't limit your choice to just a few vendors.
If you open up your definition of a J2EE application, which I think you should anyway, then things look brighter even today. E.g. in Topas, all business objects are by default just POJOs, unless you declare them to be EJBs or publish them as web services. If you want remote access to the data of those BOs, you get it through a standard client-side lib. >>

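In .net, however, developing a distributed component is exactly like developing an ordinary Java Class, and calling a distributed component is just like new-ing a class. There is no difference at all from writing an ordinary Java class.

Think about the process of developing, configuring, deploying and calling an EJB from a client in Java, then look at .net, where you just write an ordinary cs program (equivalent to a Java program), add one line of declaration to the server-side configuration file and one to the client-side configuration file, and you can new it and call it. Isn't that a big difference in development productivity?
<< Though I still don't believe .NET has the upper hand, I like the comparison you made here; it seems very objective to me. It's a close game between Java and .NET in terms of ease of use for small to medium size app dev, and that's why I want to integrate .NET with the Topas server tier through Web services.

For large enterprise system dev, which is where my eyes are, we have to think about efficiency from a different perspective; it's no longer about drag-drop-click and bingo. I guess we'd better start a new thread for that debate. >>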

Nice to talk to you
-Jevang

Jevang, hello!

1. Cluster

On a multi-CPU machine, running multiple VMs as a Cluster is indeed a strong argument. I admit my understanding of Clusters was too one-sided.

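2. Distributed components

I mentioned only EJB when comparing the productivity of developing distributed components because I think using EJB to implement distributed components is quite widespread in j2ee, so comparing it with .net's distributed components is broadly representative.

Implementing distributed components with Web Services is another option, but j2ee lags behind .net in the Web Services field. .net XML Web Services are very simple to develop: write a cs file with the asmx suffix and access it once with a browser, and ASP.net automatically compiles the cs, publishes it as an XML Web Service, generates the WSDL, and also generates an ASP page for testing the Web Service.

In the Java world, every App Server vendor has its own way of implementing Web Services, and there is also middleware to choose from, such as Apache Axis and GLUE. Developing Web Services on Axis is much like .net XML Web Services, but its performance is far behind .net, probably because MS's XML parser performs better. The j2ee1.4 spec will unify the Java implementation of Web Services, but the spec is still a final draft, and I estimate it will not be officially released until October; as for when the various App Servers will follow, who knows. This is a problem many Java technologies share. Take JDO: the 1.0 spec arrived belatedly after four and a half years of discussion, and 2.0 has not even been started. (This is also why I would rather use Hibernate than JDO.) To reach consensus among the majority and accommodate the interests of certain big vendors, a specification takes a long wait and prolonged argument to come out, and the golden opportunity to seize market share is squandered.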

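3. Component management provided by the container

In j2ee, ordinary Java components and EJBs are sharply distinct. Ordinary Java components are created by the JVM and reclaimed by GC; the objects are managed by the JVM. EJBs are not managed directly by the JVM but by the Container, which is responsible for bean instantiation, activation, destruction, pooling, transaction control, security checks and so on. So an EJB combines distribution, container transaction control, security checks and pooling in one, and the container provides none of these features to ordinary Java components. This design greatly simplifies the vendors' implementation of the container.

In .net, container transaction control, object pooling and distribution are all implemented separately. When you need to add container transactions to a cs, you just write a declaration in the cs's attribute information file. If you need to pool an ordinary heavyweight object, you likewise declare it in the cs's attribute information file, and the first time the IL compiled from that cs is called it is automatically registered with COM+. If you need the cs to have all of the above at once, you do the same. From the C# code's point of view it is just an ordinary cs program. So even entry-level C# programmers can develop, at very high speed, advanced features that in j2ee only EJB supports.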

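4. Project development productivity

Even leaving EJB aside, in terms of the productivity of writing code for a single functional component, C# still comes out ahead. But the development of a whole project involves many aspects, just as you said: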

“full life cycle, multi-tier app, Business logic, persistence, Modeling,DB schema gen/reeng, code gen, appserver deployment etc”，“For large enterprise system dev, on which my eyes are, we have to think about efficiency from different perspective, it's no longer about drag-drop-click and bingo”

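That makes it more complicated, and it is probably hard to conclude whether j2ee projects or .net projects are more productive. Probably only by restricting the discussion to a project in a specific business domain with settled requirements could we reach a result.

I think j2ee's standout advantage at present is its extremely rich resources: there are plenty of excellent Opensource, and also free, tools and frameworks for us to use, and tools exist for the entire software engineering life cycle. They can be combined into a fully automated development framework: from design to coding to version control, from coding to compilation, from compilation to packaging and release, from packaging and release to automated testing, a fully automated assembly line.

By contrast, relying only on the class libraries MS provides is clearly far from enough for .net. Although many tools are now being ported to .net (Hibernate included), it is hard to imagine people who love Opensource buying into MS.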

“But j2ee lags behind .net in the Web Services field.”
Looks like you immediately started another debate without launching a new thread. I have been on Web services for only a few months, so I'm certainly not qualified to either deny or confirm your statement; it’s just too big for me. So let me just point out a few things I am quite sure do not fit well with your rant.

People in the WS field used to praise GLUE as “the .NET of Java WS”, as it’s very easy to use. I have only used AXIS so far, but from reviewing GLUE’s feature list it’s quite obvious GLUE is ahead of AXIS. But to me AXIS is already easy enough, so that's not a big differentiator. Actually I think Web services are much, much more than just getting a Java/C# class published and able to speak SOAP; the core value is the orchestration of B2B and EAI in an automated, manageable fashion. As a matter of fact, both M$ and the rest of the world are making a joint effort behind it. Unlike the O-R mapping game, where in my personal view we have probably seen the beginning of the end, for WS the industry has merely reached the end of the beginning. Too early to tell.

Performance concerns over AXIS? Believe me, the folks there simply have not put enough effort into it yet. I am interested in massive XML handling; normally, for APIs that require complex data type mapping, I use Doc/Lit over RPC, taking care of the XML parsing myself. By non-intrusive byte code manipulation of BOs, I retain the mapping flexibility and the business-object-centric advantage, and it’s very fast. This is just one of many options they can apply to speed things up; maybe they have already done it in their CVS.

“Take JDO: the 1.0 spec arrived belatedly after four and a half years of discussion,”
That’s the way of democracy ^-^; otherwise who would care much about Castor, Ofbiz, Expresso? I might come up with a TOPAS PM which is 100% JDO compliant (or maybe I won’t do it at all).
M$ may have an implementation out quicker, but it takes years for others to sort out what it’s about, and by the time ISVs are ready to deploy their business on it, you know what, M$ has outdated it. “热脸贴冷屁股" (a warm face pressed against a cold backside): many ISVs have had that kind of experience.

I enjoy the trash talk about M$, but I admit nobody should underestimate “The power of dark side”; also, their presence is always good news for end users.

Okay, I've said too much. Now something serious: Robin, I am looking for help integrating .NET with Java. Can you introduce someone who is interested in and good at Visual.NET? Thanks
in advance.

-Jevang