software quality: the Joel Test

http://www.joelonsoftware.com/articles/fog0000000043.html

Posted in Uncategorized by yanky. 48 Comments

Yet another RESTful API that's not RESTful at all

I just ran across another self-proclaimed RESTful API, published by xiaonei.com. However, IMHO, it is not RESTful at all, which must be frustrating Fielding again. It is actually POX (plain old XML) over HTTP. Obviously REST is being used as a brand that means "buzzword compatible", which is exactly what Fielding doesn't want to see. Please see my previous post about what REST really means and how to do it.

Posted in API REST web service by yanky. 86 Comments

what does RESTful web service really mean and how?

Although SOA is a buzzword more often cited in the enterprise application context, it also applies to Internet web applications. This is where RESTful web services come into play. Most Web 2.0 websites are building some sort of RESTful web service for mashup application development. However, many self-proclaimed RESTful web services are not RESTful at all, which has left Fielding, the father of REST, a bit frustrated. Maybe some people just don't want the real RESTful stuff, so they relax the constraints a bit. Fielding also talks about what we get if we relax the REST style. The point is that there is no high REST or low REST. You can only be RESTful or not. There is no middle ground.

In case there is any misunderstanding about REST, here are the basic constraints specified by the REST style:

  • Resource oriented: each resource may have multiple representations. Resources are the nouns in a system's vocabulary. In HTTP, this is about MIME types and content negotiation (the Accept header).
  • Unique identification of resources: each resource has a unique identifier. In HTTP, this is about assigning a URI to each resource.
  • Links between resources: resources should be linked so that an agent can navigate from one resource to another. In HTTP, this is about building hyperlinks between resources, much like a sitemap.
  • Uniform interface: unlike a contract specified with a programming language interface, REST requires one uniform interface for all kinds of resources. In HTTP, the interface is a fixed set of verbs, GET, POST, PUT and DELETE, which map nicely onto the CRUD operations in applications.
  • Statelessness: yes, REST is stateless. Each request carries everything the server needs to process it. Even though sessions and cookies are used overwhelmingly in HTTP, that is the wrong way to build a scalable distributed system.
  • Layering: you can plug in intermediaries such as a proxy, a cache or an authentication layer between client and server. It is like adding filters in between, if you like.
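
To make the uniform interface constraint a bit more concrete, here is a minimal sketch in Python. The `ResourceStore` class, its in-memory dict and the status codes it returns are my own assumptions for illustration, not part of any real framework. The point is only that every resource, whatever its type, is handled through the same four verbs.

```python
class ResourceStore:
    """Hypothetical in-memory store mapping a URI to a representation (a dict)."""

    def __init__(self):
        self._resources = {}
        self._next_id = 1

    def handle(self, method, uri, body=None):
        """Dispatch one of the four uniform verbs; returns (status, body)."""
        if method == "GET":  # read
            if uri in self._resources:
                return 200, self._resources[uri]
            return 404, None
        if method == "PUT":  # create or replace at a client-chosen URI
            created = uri not in self._resources
            self._resources[uri] = body
            return (201 if created else 200), body
        if method == "DELETE":  # delete
            if uri not in self._resources:
                return 404, None
            del self._resources[uri]
            return 204, None
        if method == "POST":  # create a child under a collection URI
            new_uri = f"{uri.rstrip('/')}/{self._next_id}"
            self._next_id += 1
            self._resources[new_uri] = body
            return 201, {"url": new_uri}
        return 405, None  # any verb outside the uniform interface


store = ResourceStore()
status, created = store.handle("POST", "/book", {"name": "best 1"})
print(status, created)
print(store.handle("GET", created["url"]))
```

Note that the store never asks what kind of resource it is dealing with. A book, a review or a shopping cart would all go through the same `handle` call, which is the difference from a per-type RPC interface.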

If this summary is just too boring to read, you can refer to real-life examples of RESTful web services. Sun has published its RESTful cloud API with a JSON MIME type. The Google GData API is another good example, built on the AtomPub protocol.

If you are already familiar with REST and want to go for it, please read this IEEE article by Vinoski on best practices for developing RESTful web services.

Here is my take on how to develop RESTful web services:

  1. Identify the resources you want to expose and build a resource model, like the Sun cloud API resource model. E.g. building a resource model for a bookshop: shop, room, book, category, review, shoppingcart, order, bestsellers, featured, discount
  2. Select a MIME type, JSON or AtomPub or custom XML, to specify the representation of each identified resource. E.g. specifying the representation of the shop resource in JSON format:
    Content-Type: application/vnd.com.myshop.shop+json

    {
      "name": "xxx shop",
      "url": "http://www.xxxshop.com",
      "rooms": [
        {"name": "science", "url": "/science"},
        {"name": "management", "url": "/mgmt"}
      ],
      "shoppingcart": "/user_id/cart",
      "order": "/user_id/order",
      "bestsellers": [
        {"name": "best 1", "url": "/book/1"},
        {"name": "best 2", "url": "/book/2"}
      ]
    }

  3. Now we have resources and their representations. In other words, we have nouns. So what do we do next? Yes, verbs. We attach verbs to nouns to define the interface. E.g. for the book resource, a user can GET but cannot POST/PUT/DELETE; maybe an administrator can perform updates. For the review resource, however, a user can GET/POST/PUT/DELETE.
  4. We now have a fairly good RESTful interface, but it is not implemented yet. So we have to choose a convenient programming language to implement it. In Java, Jersey is maybe the best choice. Ruby has quite good support for RESTful web services through the ActiveResource package.
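
Steps 2 and 3 can be sketched in a few lines of Python before picking any framework. This is only an illustration under my own assumptions: the function names, the role names ("user", "admin") and the choice of PUT for the administrator's update are hypothetical, and only part of the shop representation is reproduced.

```python
import json

# Media type taken from the shop example above.
SHOP_MEDIA_TYPE = "application/vnd.com.myshop.shop+json"


def shop_representation(user_id):
    """Step 2: build (headers, body) for the shop resource.

    The URLs inside the body are the links a client follows to reach
    the related resources.
    """
    shop = {
        "name": "xxx shop",
        "url": "http://www.xxxshop.com",
        "rooms": [
            {"name": "science", "url": "/science"},
            {"name": "management", "url": "/mgmt"},
        ],
        "shoppingcart": f"/{user_id}/cart",
        "order": f"/{user_id}/order",
    }
    return {"Content-Type": SHOP_MEDIA_TYPE}, json.dumps(shop)


# Step 3: attach verbs to nouns, per role.
ALLOWED_VERBS = {
    ("book", "user"): {"GET"},
    ("book", "admin"): {"GET", "PUT"},  # assumption: "update" means PUT
    ("review", "user"): {"GET", "POST", "PUT", "DELETE"},
}


def is_allowed(resource, role, method):
    """Return True if this role may apply this verb to this resource."""
    return method in ALLOWED_VERBS.get((resource, role), set())


headers, body = shop_representation("user_id")
print(headers["Content-Type"])
print(is_allowed("book", "user", "DELETE"), is_allowed("review", "user", "POST"))
```

Writing the verb table down like this before coding makes it easy to review the interface with other people, and most frameworks let you translate it into routes almost mechanically.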

If I get enough time, I will put together a complete example of RESTful web service design, using an online bookshop as the demo application.

Posted in REST web service by yanky. 32 Comments

SOAP is boring, we need more REST

Since SOA became a buzzword, web services have been touted by vendors as the holy grail for EAI, or even more, for restructuring the existing IT architecture to earn an SOA brand. However, if SOA is not driven by real business needs, it is doomed. So if we have done an extensive cost-benefit analysis of SOA and concluded that we will adopt it, how do we go about it? Basically SOA is a top-down approach, because implementing SOA needs a holistic view of the entire IT infrastructure. Maybe there are incremental ways to do SOA that I don't know of. As a top-down approach, it is more about governance than about devising a fancy cutting-edge technology to implement it. However, the technological aspect does partly determine the adoption rate of SOA.

For now, web services are the mainstream implementation technology for SOA because the big vendors are driving them. But recently there is a new buzzword: REST. REST has been a hot topic at many technology conferences, like http://enterprisewebconf.com/sessions.html, the state of REST vs. SOAP, intro of REST, and a QCon presentation about combining REST & WS-*. The most interesting one I have watched is by Steve Vinoski, who has been in the trenches for decades. When a CORBA guy talks about distributed systems, we should be listening. So what is he really saying? Well, RPC is fundamentally flawed, and REST is a better alternative. That's what he is advocating. However, some people don't buy it. Hot debates happened here, here and elsewhere. One of the points that I think makes sense is that it depends on how much control you have over the system being built. If you have total control of all the endpoints of the system, RPC can be used for optimized performance; on the other hand, if some of the endpoints are outside your control, REST is the better alternative. By this reasoning, SOAP just doesn't fit anywhere. Here is an extensive comparison between WS-* and REST.

UPDATE: I just ran into this post about what Gartner has coined WOA (Web Oriented Architecture). Actually WOA is just Gartner's attempt to make a brand of its own out of REST. Nothing new. On the other hand, Gartner proposes WOA as a set of constraints on the WS-* stack. How can this be done in the real world? I suspect vendors have motives for pushing it.

Posted in REST SOA web service by yanky. No Comments

A note on software architecture style classification

The architecture styles of software systems have evolved over decades. We can classify these styles as below.

1. No Architecture
no unified principle, thus no architecture
an integration task is needed to plug each application into the whole enterprise after it is developed
applications interact in a point-to-point way
each application has its own data store
the number of interfaces bloats as O(n*n)
also referred to as “post integration”
drawbacks: lack of semantic consistency
uncontrolled data replication
result: tight coupling, ripple effect

2. The Integrated Database Architecture
unified data model with clearly defined semantics
applications interact through a single data store
the single data store is also a giant “global variable”
still results in tight coupling and ripple effects

3. The Distributed Object Architecture
the OO model ensures consistent semantics
still results in tight coupling and vendor lock-in
examples: EJB, DCOM, CORBA

4. Message Broker (Hub and Spoke)
star-like topology
applications interact through the central broker
adds an intermediary between applications, thus an application can be removed or replaced without affecting the others
drawbacks: single point of failure
limited scalability
example: webMethods

5. The Message Bus Architecture
flexibility is one of the most crucial qualities of a modern organization
imagine the main board bus architecture in a computer
a return to the Integrated Database Architecture, but differences remain
applications interact by sending messages conforming to a message schema
drawbacks: proprietary messaging protocol, vendor lock-in
security risks including network flooding
message format adaptation
example: TIBCO Rendezvous

6. Hybrid Architecture
virtual group
each group contains nodes acting as broker and bus
example: Microsoft BizTalk

7. Service Oriented Architecture
services everywhere
each application exports its own functions as services which can be consumed by other applications
each application can also import services provided by other applications in implementing its own functions
Put simply, each application can be both service provider and service consumer.
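
To put numbers on the O(n*n) interface bloat of style 1 compared with the hub-and-spoke topology of style 4, here is a small back-of-the-envelope sketch in Python. The function names are mine, and I count one interface per pair of applications, which is an assumption; if each direction needs its own interface, the point-to-point figure doubles.

```python
def point_to_point_links(n):
    """Style 1: every pair of applications may need its own interface."""
    return n * (n - 1) // 2


def hub_and_spoke_links(n):
    """Style 4: every application only connects to the central broker."""
    return n


for n in (5, 20, 100):
    print(f"{n} applications: "
          f"{point_to_point_links(n)} point-to-point interfaces, "
          f"{hub_and_spoke_links(n)} broker connections")
```

With 100 applications that is 4950 possible pairwise interfaces against 100 broker connections, which is why the ripple effect in style 1 gets out of hand so quickly.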

Conclusion
1. No silver bullet, no one-size-fits-all solution.
2. No perfect architecture, only appropriate architecture.
3. Big upfront design is less feasible than incremental, iterative design.

Posted in software architecture by yanky. 25 Comments

some new stuff worth a look

I came across InfoWorld’s 2008 best open source software awards yesterday. Today InformationWeek’s Top 50 startups list popped up. Some of them are definitely worth a look.

1. Git: a distributed version control system that has been used for the Linux kernel, Fedora and other important open source projects with geographically distributed contributors.

2. Intel Threaded Building Blocks: an open source, cross-OS C++ template library for parallel programming on x86. The essence of this library is a work-stealing scheduler. There is an equivalent API in Java, the fork-join framework, which is under development.

3. Alfresco: an open source Enterprise Content Management alternative to MS SharePoint. Most Java projects use the Confluence wiki for a similar purpose, but an ECM solution provides a richer feature set.

4. Hyperic HQ: comprehensive open source application and system monitoring solution

5. Pentaho: an open source Business Intelligence suite whose data mining component is built on Weka, a comprehensive machine learning algorithm package. Note: I have tried Weka for web page classification. It is more lightweight and developer-friendly than other open source alternatives such as RapidMiner.

6. Vyatta: an open source router, firewall & VPN solution, claimed to be a Cisco alternative. Ambitious! Here are some intro webcasts. And here is a comprehensive review. Another similar but more academic project is XORP. Oops! It seems Vyatta was actually derived from XORP. Anyway, we can consider using it as a replacement for low-end Cisco products. More importantly, students can download it and build a virtual network lab with VMware-like virtual machine software. Thanks to these guys for the hard work!

7. Metasploit Framework: an open source penetration testing toolkit that can be used to hammer applications to find potential security vulnerabilities. It can also be used for malicious attacks.

8. Splunk: a log analysis framework (commercial rather than open source) that can analyze logs from various sources to find security threats.

9. Amanda: maybe the most widely used open source backup solution.

10. Abiquo: open source cloud computing solution provider, ambitious too!

11. Eucalyptus: yet another open source cloud computing solution, but more academic.

12. openQRM: open source data center management software, not touted as cloud stuff yet, but it could be.

I will elaborate when I try any of them.

Posted in open source software tools by yanky. 5 Comments

architecture principles notes

When I watched a presentation by an eBay architect about eBay's architecture principles, I was thinking about how we could figure out which architecture principles to use in my own project cases. After all, architecture principles vary from company to company and from project to project. So what are they derived from? After reading some resources, here are my notes.

1. what?

Before we go further, we’d better make clear what architecture principles are. Here is a definition from the TOGAF enterprise architecture framework:

Architecture principles are a subset of IT principles that relate to architecture work. They reflect a level of consensus across the enterprise, and embody the spirit and thinking of the enterprise architecture.
……
Architecture principles define the underlying general rules and guidelines for the use and deployment of all IT resources and assets across the enterprise. They reflect a level of consensus among the various elements of the enterprise, and form the basis for making future IT decisions.

Each architecture principle should be clearly related back to the business objectives and key architecture drivers.

It seems way too dogmatic. Here are the guts:

  • They are IT principles.
  • They are general guidelines and rules of utilizing IT resources.
  • They should be well aligned with business objectives.

Here are the components an architecture principle usually contains:

  • Name: representative name with clear meaning
  • Statement: description of unambiguous fundamental rule
  • Rationale: highlights the business benefits, points out the relations to business principles and to other architecture principles, and explains how to weigh them in context
  • Implications: requirements for both IT and business to carry out the principle, in terms of resources, cost and activities. It’s about impact and consequence.

Here is an example from the Example Set of Architecture Principles in TOGAF. Another example is the NIH enterprise architecture, which is perhaps more technology oriented.

2. how?

According to the above interpretation of the what, we can only derive these architecture principles from business objectives. Here is a method for running a workshop to draw them up. The key points are:

  • Identify Strategic Objectives
  • Record Strategic Objectives
  • Identify Architecture Principles
  • Explain Architecture Principles
  • Prioritize Architecture Principles
  • Show Prioritization Results

reference:

1. http://it.toolbox.com/blogs/enterprise-solutions/running-an-architecture-principles-workshop-12581
2. http://it.toolbox.com/blogs/enterprise-solutions/sample-architecture-principles-workshop-agenda-12598
3. http://www.opengroup.org/architecture/togaf9-doc/arch/
4. http://www.bredemeyer.com/HotSpot/20040428EASoapBox.htm
5. http://enterprisearchitecture.nih.gov/ArchLib/Guide/EnterprisePrinciples.htm
6. http://enterprisearchitecture.nih.gov/About/Approach/Framework.htm
7. http://blogs.msdn.com/architectsrule/archive/2008/04/22/reference-architecture-principles.aspx

handy system administration and monitoring tools

Just a memo:

maybe the most extensive list on the net about system administration:
http://www.slac.stanford.edu/xorg/nmtf/nmtf-tools.html

Among the list, here are those I have used:

Ntop/Nmon: network traffic data collection

Currently the most popular and also the oldest network monitoring tools might be Nagios (network and server monitoring) and MRTG (mainly network traffic). Another one, cfengine (written in C), is getting popular for its powerful rule-based system for executing management scripts. Rule languages are not new. They have been widely used in business rules engines like ILOG and Drools. They also shine when used for system administration. I will give it a try if I have a chance.

There are some new open source tools worth a look:
1. OpenNMS: Java
2. Hyperic: Java
3. Zenoss: Python

And it seems the old boys are losing favor.

operation dimension of system architecture

In terms of software architecture, there are usually various stakeholders involved in a specific system architecture. Each of them might have different architectural requirements. The product department often submits functional requirements. The operations department often submits system management or monitoring requirements. The accounting department may submit billing requirements. And in some cases the system has its own inherent non-functional requirements such as performance, availability and other SLA guarantees. In a word, a system architecture always involves quite a lot of dimensions. We have to think about all of them to get a full picture of the system. However, developers are usually myopic and rarely think about the other dimensions. After all, when the system rolls out, developers have to work closely with operations people to get feedback about the production system. If developers are not well prepared, they may end up getting nothing. Even worse, they will get entangled in the operations side. Here are some points developers could consider and prepare for in advance.

The first question is: how do we get the status of the production system?

The common approach is to log extensively in the system itself and send a notification email when things become abnormal. Simple! But it doesn’t work when the system is down. Another disadvantage is that application-level logging only cares about the system itself. What about a machine power-off, a disk failure or a network outage?

So we should have an independent, fully functional health management system. Usually this system is maintained by the operations department. Then there is a gap, both social and technical. The social one is that the two departments have to cooperate to make the system work. The technical one is how to make the existing health management system aware of the new system. It depends on both sides. The health management system should be extensible so that it can adapt to any kind of new system. Luckily some fully functional monitoring systems qualify. And the new system itself should provide a health checking interface that the health management system can call. So far so good. When the system goes wrong, the health management system will be the first to be notified. If the operations people can deal with it, developers can sleep well. Otherwise, developers will get busy.

Another important point is that trust should be built between the operations department and the development department. Developers should add to their toolbox a lens through which they can view the operations dimension. System administrators should likewise add a lens through which they can view the development dimension, because a full understanding of the new system helps them monitor it more extensively.
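
As an illustration of the health checking interface mentioned above, here is a minimal sketch using only the Python standard library. Everything specific in it is an assumption of mine: the `/health` path, the port, the JSON report format and the `check_database` probe are hypothetical placeholders for whatever the real system and the monitoring system agree on.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def check_database():
    """Hypothetical dependency probe; a real one would run a cheap query."""
    return True


# Name -> probe function. Add one entry per dependency worth watching.
CHECKS = {"database": check_database}


def collect_health(checks):
    """Run every probe and return (HTTP status, report dict)."""
    report = {}
    for name, probe in checks.items():
        try:
            report[name] = "ok" if probe() else "failed"
        except Exception as exc:  # a crashing probe counts as unhealthy
            report[name] = f"error: {exc}"
    healthy = all(state == "ok" for state in report.values())
    return (200 if healthy else 503), report


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health for the external health management system."""

    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        status, report = collect_health(CHECKS)
        body = json.dumps(report).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Blocks forever; run it in its own thread or process."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

The monitoring side then only needs to poll the URL and look at the status code. If the poll itself times out, that is also a signal, which covers the "system is down" case that in-process logging and email cannot.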

The reason I am aware of the operations aspect of system architecture is that it is getting more and more important today. Service has been a buzzword for years: SOA, SaaS, PaaS, Web Services and so on. So how can we measure the quality of a service? Yes, the SLA (Service Level Agreement): four nines of availability and a 1s response time, for example. That’s it. But how can we reach that SLA? It is closely related to operations. So keep an eye on it.

UPDATE: Here is a good post on the same topic, monitoring a Java system, but more specific.
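
To get a feeling for what four nines means in practice, here is a small arithmetic sketch in Python. The function names and the idea of measuring availability from periodic health polls are my own assumptions, not a standard.

```python
def allowed_downtime_minutes(availability, period_days=365):
    """Downtime budget implied by an availability target over a period."""
    return (1 - availability) * period_days * 24 * 60


def measured_availability(successful_polls, total_polls):
    """Availability estimated from periodic health polls."""
    return successful_polls / total_polls


yearly = allowed_downtime_minutes(0.9999)
monthly = allowed_downtime_minutes(0.9999, period_days=30)
print(f"four nines: {yearly:.2f} min/year, {monthly:.2f} min per 30 days")

# One poll per minute for 30 days, 10 of them failed:
polls = 30 * 24 * 60
print(measured_availability(polls - 10, polls) >= 0.9999)
```

Four nines leaves roughly 52.6 minutes of downtime per year, or a bit over 4 minutes in a 30-day month. Ten failed one-minute polls in a month already break the target, which shows how little room there is for slow manual recovery.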