Archive for the ‘architecture’ Category

I realized a common analogy would be a useful tool to help non-engineers and non-developers understand how software gets built… why not a car? A modern vehicle is a feat of engineering. Rolling down the highway at 70 mph means all sorts of vibrations and harmful oscillations must be damped out to prevent the vehicle from destroying itself. Modern vehicles like my truck actually have a LAN-like network called a CAN (Controller Area Network) that connects everything from the transmission to the radio on a single digital bus. A lot of research and development goes into making a vehicle, so how does that compare to software development? Engineering a vehicle has many phases, the early stages being definition, planning, and experimentation; let’s focus on a few of these.

Definition

The first step of any development process should be definition, or creating a vocabulary for the problem space. The goal is to make sure everyone is talking about the same thing. For instance, in web application development, a user and a customer can be different people!

While designing a car, there are terms like driver, passenger, steering wheel, e-brake, and many more. These terms must be used consistently throughout the development lifecycle. Ambiguities must be sought out and destroyed. For instance, is a driver also a passenger? Or perhaps drivers and passengers are both occupants? This critical step is glossed over by Agile methodologists who want to jump right into coding.
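
To make the idea concrete, here is a minimal Java sketch of how a settled vocabulary can show up directly in code. The type names are my own invention for the car analogy, not from any real vehicle system:

abstract class Occupant {
	private final String name;

	protected Occupant(String name) {
		this.name = name;
	}

	String getName() {
		return name;
	}
}

// Once the team agrees that drivers and passengers are both occupants,
// the vocabulary and the type hierarchy say exactly the same thing.
class Driver extends Occupant {
	Driver(String name) {
		super(name);
	}
}

class Passenger extends Occupant {
	Passenger(String name) {
		super(name);
	}
}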

A great (and free) resource on how to create a business vocabulary is Domain Driven Design Quickly, a condensed introduction to Eric Evans’ Domain-Driven Design. It steps through the process of creating an air traffic control system. Nearly an entire chapter is spent defining the actors, nouns, actions, and relationships in the air traffic control ecosystem before introducing any DDD concepts.

If you’re a non-engineer, you should demand that your developers create this vocabulary with you. As you’re painfully aware, communicating what you want to developers is difficult. A shared vocabulary is a communication channel. Do not skip straight into development or design without it!

Planning

When GM had automotive engineers design the Chevy Volt, it’s unlikely that they put a bunch of sticky notes on a board that said, “A user should be able to steer the car in a particular direction,” then ran off willy-nilly and built the steering rack out of modern hydraulic actuators, only to realize later, in sprint 25, when they finally got around to building the engine, that a hydraulic steering rack isn’t ideal for an electric car. It turns out hydraulics require constant pressure from a turning pump, and an electric vehicle’s motor isn’t turning at red lights.

They didn’t spend 8 years drawing on paper either, and only then start building the car. Each time they designed a component of the vehicle, they had to change the plans for the rest of the vehicle. Their design was iterative, and if you look at the early prototypes, they look nothing like the final product.

Neither extreme is healthy. Unfortunately, Agile developers tend to want to get started quickly without thinking about the big picture, while old-school waterfall developers like to think that once they develop a plan they should never, ever change it, and that they can foresee all the problems that would arise with a particular design.

Software engineers should fall somewhere in the middle. After the user stories are reasonably complete, a domain model should be developed that shows each object’s attributes and how it relates to the rest of the domain. After the domain design is created, an architecture plan should be developed that shows how objects are stored, processed, and shuffled around the system. Cross-cutting concerns, such as logging, monitoring, and auditing, should be introduced at this point.
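
As a rough illustration, a first cut of a domain model can be nothing more than a few classes whose attributes and relationships everyone has agreed on. The classes below are invented for the car analogy (reusing the Occupant type from the earlier sketch), not taken from any real system:

import java.util.ArrayList;
import java.util.List;

// A hypothetical first-cut domain model: attributes and relationships only,
// no persistence or framework code yet.
class Vehicle {
	private Engine engine;  // one engine per vehicle
	private final List<Occupant> occupants = new ArrayList<Occupant>();

	void addOccupant(Occupant occupant) {
		occupants.add(occupant);
	}

	Engine getEngine() { return engine; }
	void setEngine(Engine engine) { this.engine = engine; }
	List<Occupant> getOccupants() { return occupants; }
}

class Engine {
	private int horsepower;

	int getHorsepower() { return horsepower; }
	void setHorsepower(int horsepower) { this.horsepower = horsepower; }
}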

Unfortunately, there is no easy road for non-engineers here. Your developers should have a technical plan, and they should be able to explain it to you in a way you understand. Each iteration, the technical plan and model should be updated and changed. If your developers keep exactly the same technical plan between iterations, that should be a red flag: no one gets it right the first time. During development, your engineers will encounter business scenarios and technical problems the team simply didn’t think of. Each iteration should be a refinement, or even a roadmap change if you really just “got it wrong.”

Experimentation

As I said earlier, GM didn’t spend 8 years drawing the design out on paper until they thought they solved every possible problem that popped into their heads. After they defined their vocabulary and came up with a basic design, GM experimented with 25 different battery chemistries in real life using various test platforms to validate their design. Turns out, some manufacturers had no idea that their batteries tended to explode.

Research and development is something you can actually write off on your taxes. R&D should be done at the beginning of every iteration, after design and before development. Creating, measuring, and analyzing the performance, productivity, and maintainability of a particular design will help you weed out expensive mistakes. Developers should concentrate on setting specific goals and communicate effectively to their peers what they did, why they did it, and what happened. A wiki is a great place to store R&D results; later, when it is time to claim the tax write-offs, accounting can show that actual R&D was done.

This is an area that developers love and business managers don’t understand. Business managers tend to think playing with shiny development toys is a waste of time when ‘real work’ implementing requirements should be done. (If you work for a company like this, I encourage you to quit your job. These types of factories deserve to be talent-starved and are not worth working for.)

The trick to making sure that real work gets done is timeboxing R&D and setting goals. Specific goals should be defined first, then a timebox sized proportionately to the experiment should be allotted. A specific goal might be evaluating one framework against another, or one design against another. Large R&D timeboxes should be allocated at the beginning of the project, with small ones at the end. Again, the goal here is to avoid waste. Being proactive rather than reactive, while it may feel counterintuitive, will only help your project succeed.

…there are things that we now know we don’t know. But there are also unknown unknowns. There are things we do not know we don’t know.

-Donald Rumsfeld (an odd person to be quoting)

The ultimate goal of R&D is to discover the things you don’t know that you don’t know. Discovering failures proactively saves you time and makes developers happy.

Conclusion

The difference between “coding” and “engineering” is that engineering strives to be deterministic and predictable. Coders tend to jump straight into things without giving thought to long-term business goals, then jump ship when the waters get rough. The solution is to follow all the steps of an engineering lifecycle and produce engineering artifacts such as models and plans. Finally, just like a car, models and plans require maintenance, and should be changed and upgraded frequently during the engineering lifecycle.

In the comments, how about some stories where upfront design was skipped? Also, let’s hear some stories where management locked engineers into a failing design!

What is the secret to good architecture? While there are plenty of posts written about this topic on the internet, I found this article on DZone (originally posted here) intriguing.

The article is written by Niklas Schlimm, whom I do not know personally. The article can be summarized with three points:

Three [proposed] Laws of Good Software Architecture

  1. Good architecture is one that enables architects to have a minimum of irrevocable decisions.
  2. To make decisions revocable you need to design for flexibility.
  3. To make use of flexibility one needs to refactor mercilessly.
CYA Engineering?
  1. What I think Niklas is trying to say is: you will get it wrong, and sadly, you’ll get it wrong most of the time, so design an architecture where you can fix your mistakes with a couple of swaps. Is designing an architecture that allows you to revoke your decisions on a whim really fair? Does the CEO get the same privilege?
  2. Flexible is not durable. They don’t make buildings out of rubber. We’ve probably all worked on over-engineered legacy systems that were “flexible.” Most of the time, good intentions turn into dancing circles around libraries and frameworks in an unholy ritual of abstraction.
  3. You should be refactoring mercilessly, all the time. But do you need 10 layers of abstraction to do this? Do revocable decisions allow you to revoke custom code?
Alternative laws

Let’s address the real problem here. Code doesn’t age or stop working; it’s humans who forget how things work and why decisions were made, and who become disenchanted with the past. The input to the system changes, usage levels increase, and people leave the organization. The only real constant is the code itself!

Given that, I present:

Laws of software developed by Humans

  1. Humans cannot predict where software will be extended in the future. Instead, trust in YAGNI and dependency injection (see the sketch after this list).
  2. Humans cannot predict where software is slow. Instead we have profilers and the three rules of optimization.
  3. Humans cannot predict when or how business direction, the economy, or technology will change. That’s why we have iterative development: Agile, Scrum (http://en.wikipedia.org/wiki/Scrum_(development)), and XP.
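
Here is a minimal sketch of what alternative law #1 looks like in code: program against a small interface and hand in the implementation through the constructor, so the concrete choice stays revocable without any speculative framework ceremony. The names below are invented purely for illustration:

// The caller depends only on a tiny interface…
interface ReportStore {
	void save(String reportId, byte[] contents);
}

// …and receives the concrete implementation from the outside.
class ReportService {
	private final ReportStore store;

	ReportService(ReportStore store) {
		this.store = store;
	}

	void publish(String reportId, byte[] contents) {
		store.save(reportId, contents);
	}
}

// Swapping this for a database-backed store later is a one-line change at
// the composition root; no extra "flexibility" layer is needed up front.
class FileReportStore implements ReportStore {
	public void save(String reportId, byte[] contents) {
		// write to disk; details omitted in this sketch
	}
}
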
Refuting the laws given with the alternative laws
  1. Good architecture is one that enables architects to have a minimum of irrevocable decisions.
    Alternative law #3 says you won’t be able to predict which decisions are irrevocable. Instead of designing an architecture with revocable everything, design the minimum possible architecture that barely scrapes by to get the job done.

    Instead of developing your own bash build script, use Maven, and don’t do anything silly with it. “mvn clean install” and “mvn clean deploy” should be all the commands you need to send your artifact onward.

    Instead of getting locked into a particular open source project or vendor, use JEE6 standards like JSF2.0, JPA2.0, JMS, and EJB3.1, or commit to using Guava and Guice. Simplicity is key; less is better. Using tools the way they were intended saves time.
  2. To make decisions revocable you need to design for flexibility.
    Alternative law #1 says you stink at designing for flexibility, so don’t do it. Rather than designing for flexibility, design an exit strategy document for each technology you use. This includes frameworks built in house: how will you exit from those once they become legacy code? (The best way is to abstain from writing them.)

    Alternative law #2 also applies here: don’t use obscure technology like a NoSQL database until you have actually determined, by profiling, that you need a NoSQL database. If you’ve truly designed a minimum architecture and opted for simplicity, it will be simple to swap out your backing store for a different database. It’s good to remember that you are not Google. If you design a simple architecture, you’ll have plenty of time later on to switch to NoSQL when it’s needed, rather than rewriting a thousand lines of Redis-specific JSON handling code upfront.
  3. To make use of flexibility one needs to refactor mercilessly.
    Given alternative laws 1, 2, and 3, this brings old law #3 into new light. You should refactor mercilessly and eliminate technologies. Gauge yourself: how many new technologies did you bring in versus how many exit plans did you execute?

    Develop a master engineering architecture plan. Every release should bring you closer to the master design, at which point you should update your exit plans and then update your master architecture plan. What technologies could minimize the architecture? Evaluate them… do they require glue code to be written? Which frameworks can we exit from, and which technologies aren’t working out? Get rid of them, quickly. What in-house utility projects (aka future “legacy code”) have been written? Open source them, replace them, or better yet, observe that YAGNI has manifested itself.
Conclusion, flame wars

When in doubt, don’t write it. Change your processes so you can use a tool without modification. Don’t dance around frameworks with unnecessary abstraction. Above all else, simplicity is key!

Finally, remember this is an attack on complicated software, not the author of the original post. I’ve enjoyed reading his blog. I look forward to your comments!

Perhaps I’ve had a bad week, but I am not a fan of DDD… today. Maybe it’s because I’ve spent close to 50% of my career refactoring code that was over-engineered, unclear, or redundant… and DDD just happened to be present. Maybe the previous generation of programmers sat around the water cooler discussing design patterns and agreed DDD was second only to pre-sliced pancakes. Maybe I haven’t really taken the time to understand when DDD really works, and my sorrow and anger are misplaced. Perhaps many more agree with me, and I could incite lively discussion on DZone or in the comments below…

I think it was my uncle from whom I first heard the phrase, “Always walk a mile in the other guy’s shoes… that way you’re a mile away and you’ve got his shoes.” Let’s start with what DDD is, a simple example, and some of the pitfalls I’ve had to deal with this week.

DDD is Domain Driven Design. In a nutshell, the idea is to take the Object Oriented Programming (OOP) principles, understand them, then make them pervasive in your application architecture. The original concept of OOP was to bundle data, and the logic that guards that data, together in a structure called an object. The theory is that you model objects after real-world behavior, or as the popular buzzword puts it these days, “business logic.” Let’s put the business logic right with the business data and pass it around. This is very OO! Eventually you’re so OO that your ‘view layer’ applications are essentially just reflections of your domain layer… DDD is actually an entire process, but I’m going to focus on just the coding part for this post…

A good start would be the old, oft-abused bank account example. Let’s have an object that represents someone’s bank account:

public class Account {
 private long balance;  // bitcoins
}

This has data, but no business logic, so we can’t get to the account balance. Let’s add some:

public class Account {
 private long balance;

 public long getBalance(){
  return balance;
 }

 public void decrementAmount(long amount){
  balance = balance - amount;
 }

 public void addAmount(long amount){
  balance = balance + amount;
 }
}

Well, this isn’t bad. I can clearly see what’s going on here; the methods are self-describing, and the code is clear about its intentions. I’m really feeling good about this, actually. Now let’s add the ability to transfer money between accounts and to apply interest:

public class Account {
	private long balance;

	public long getBalance() {
		return balance;
	}

	public void decrementAmount(long amount) {
		balance = balance - amount;
	}

	public void addAmount(long amount) {
		balance = balance + amount;
	}

	public void transferAmountFromAccount(long amount, Account foreignAccount) {
		foreignAccount.decrementAmount(amount);
		addAmount(amount);
	}

	public void fireInterestCalculation(double interestRate) {
		// office space arithmetic
		long interest = (long) (interestRate * balance);
		addAmount(interest);
	}
}

Well, this is interesting… In accordance with the DRY principle, the transfer and interest methods reuse other public methods internally. We could continue: maybe throw a NonSufficientFunds exception by calling getBalance first, or even make the interest calculation recursive. The reuse gets even better once we start to use polymorphism, extend our objects, and generalize them. As we apply the OO principles, we continually invoke the information hiding principle, and complex business logic becomes nothing more than a glorious stack of method calls… And by now you too are thinking such a method stack could only be rivaled by a stack of pre-sliced pancakes.
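
For instance, a guarded withdrawal might look something like the sketch below. The NonSufficientFundsException class and the revised decrementAmount are my own illustration of layering a check on top of the existing methods, not code from the original example:

// A hypothetical non-sufficient-funds guard layered onto the Account above.
class NonSufficientFundsException extends RuntimeException {
	NonSufficientFundsException(String message) {
		super(message);
	}
}

public class Account {
	private long balance;

	public long getBalance() {
		return balance;
	}

	public void decrementAmount(long amount) {
		// Reuse getBalance() rather than duplicating the balance check elsewhere.
		if (amount > getBalance()) {
			throw new NonSufficientFundsException(
					"cannot withdraw " + amount + " from a balance of " + getBalance());
		}
		balance = balance - amount;
	}

	// addAmount, transferAmountFromAccount, and fireInterestCalculation as above…
}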

So what is the opposite of DDD? An ‘Anemic Domain Model’, one in which domain objects carry data but no logic. Here is the equivalent object in an anemic model:

public class Account {
	private long balance;

	public long getBalance() {
		return balance;
	}

	public void setBalance(long balance) {
		this.balance = balance;
	}
}

The business logic lives in a service somewhere else…
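
That service typically ends up looking something like the sketch below (the class and method names are hypothetical, not from any particular framework):

// A transaction script: all of the behavior lives outside the domain object,
// which has been reduced to a bag of getters and setters.
public class AccountService {

	public void transfer(long amount, Account from, Account to) {
		from.setBalance(from.getBalance() - amount);
		to.setBalance(to.getBalance() + amount);
	}

	public void applyInterest(Account account, double interestRate) {
		long interest = (long) (interestRate * account.getBalance());
		account.setBalance(account.getBalance() + interest);
	}
}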

Let me be perfectly honest… I started this article very harshly. I actually like DDD. DDD is the manifestation of all the OO principles. You’re using the language to its fullest extent. Design patterns flow more naturally. Maybe you have fully achieved the purity of quantified conception… I hear it repeated that “anemic domain models are leftover from procedural programming,” which to some extent is true. But are they all that bad? No, but with an anemic domain model you aren’t taking advantage of the language. So DDD is the way to go? I think so, but there seem to be a few fundamental limitations with DDD that must be managed, and in my career, I’ve seen them managed ineffectively. The articles I’ve read on DDD focus very much on the benefits of DDD and how to implement it, but fail miserably at managing the lifecycle of DDD.

Jon’s DDD Antipattern #1: Your programmers must be domain experts.
This is an oft-omitted detail of doing DDD. Writing DDD effectively means you need programmers who understand the language of your domain fully, or your analysts (who understand the domain fully) must be able to communicate in terms of OO. This should be fairly obvious, because effective DDD means translating your domain from terms of accounts and interest rates into objects and functions. Furthermore, if your organization has a lot of turnover, you will be bitten by DDD rather than benefit from it, as knowledge of how your domain was implemented evaporates.

Solution: Retain your experts, both technical and business. Recruit fresh talent into the organization to widen the pool. Group your domain experts and technical staff by domain, not by job position.

Jon’s DDD Antipattern #2: The impatient incompetent programmer
Impatient and lazy programmers are general antipatterns, but in DDD I feel they can really harm you. Remember how I showed above that we could recycle existing methods in our code? Let’s say Jane is late on a project, and she needs to add compounding interest to our Account object above. Because she’s in a hurry, she doesn’t have time to study the domain or the previous implementations. She extends the Account object as CompoundingAccount:

public class CompoundingAccount extends Account {

	public void compoundInterest(int periods, double rate) {
		for (int i = 0; i < periods; i++) {
			// more office space arithmetic
			long interest = (long) (rate * getBalance());
			addAmount(interest);
		}
	}
}

This code enters your domain, and hundreds of view layers begin using CompoundingAccount. Next, your company gets sued for not computing interest correctly (as shown here). You, as the domain expert, know exactly where to make this change in the root Account object. You do so, but because Jane did not obey the DRY principle, she has caused compounding damage to the domain! You could have hundreds of methods and views built on either fireInterestCalculation or compoundInterest. Think this won’t happen to you? Be realistic: it’s not a matter of if these events will happen; it’s simply a matter of when. And if you don’t have a contingency plan or governance, you’ve already failed.

Solution: Recognize that DDD is a true double-edged sword. As business logic propagates down your domain tree, mistakes will propagate too, and those mistakes will compound as methods are reused. You must set up governance in your domain. You must have centralized domain experts who develop, and who can tell you the correct place to implement new functionality as your domain expands.

Finally,
Jon’s DDD Antipattern #3: DDD is made ineffective in a distributed environment by most SOA implementations.
This is also frequently omitted. Remember how we said that classes were designed to hold state and business logic together? What happens if we translate the Account object to XML (a very common SOA payload format)?

<account>
 <balance>375</balance> 
</account>

Where did my business logic go? It isn’t transferred… Protocols like REST, SOAP, RMI, and Hessian transmit the state of business objects, but they do not transmit the implementation. How, then, can you transmit an object between two independent systems and guarantee that fireInterestCalculation will produce the exact same value? You actually can’t. The implementations on both ends of the wire must be the same, which you can’t guarantee in a distributed environment.
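
To see this concretely, here is a small sketch that marshals an account’s state to XML with JAXB. The AccountDto class name and the mapping are my assumptions for illustration; only the field state shows up in the output:

import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

// Hypothetical JAXB mapping of the Account state. Only the balance field is
// described; none of the business methods survive the trip across the wire.
@XmlRootElement(name = "account")
@XmlAccessorType(XmlAccessType.FIELD)
public class AccountDto {

	private long balance = 375;

	public static void main(String[] args) throws Exception {
		Marshaller m = JAXBContext.newInstance(AccountDto.class).createMarshaller();
		m.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
		// Prints the <account><balance>375</balance></account> document shown
		// above; fireInterestCalculation never crosses the wire.
		m.marshal(new AccountDto(), System.out);
	}
}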

Solution: The only thing I can think of right now that might come close to resolving this issue is the little-used feature of Java RMI called remote class loading. Remote class loading will sense that a client does not have an implementation and load the implementation across the network. Maybe this is a solution? I haven’t experimented with it.

So as I sit here slicing my pancakes, I’ll admit I’m not opposed to DDD. In my career, DDD has simply been an innocent bystander in the war of egos vs. time schedules. The problems with DDD have been in its execution, not its integrity. We’ve all heard the benefits of DDD many times over, but I see very little discussion about how to manage its pitfalls. How do you manage DDD effectively? Have you any stories of how DDD has become ineffectual? How do you deal with change management in your domain?

Once again, thank you for reading!

Two years back, I was tasked with redesigning the member portal for my company. We had a great set of tools we could crank smaller applications out with, and a brilliant team of people who understood them. This team frequently debated best practices in Maven, Spring, Hibernate, etc., and the design patterns that surrounded them. Bean naming conventions were tossed about, file naming conventions were invented, and overall, it was a pretty good life. However, due to the economy and slow business, development teams were cut back. As the lead engineer on the member portal redesign, I watched the team get cut to two engineers… myself being one of them.

At the time, we built our applications around Spring’s best practices. You could say our applications completely revolved around Spring. We had interfaces for everything: we had domain jars, DAO jars, service jars, facade jars, war poms, and ear poms. The jars were built into every application; we had a service jar that knew how to get member objects from our database; we had another just to look up pharmacies… Each one of these got baked into the applications… Since each application went to production separately, our platform fragmented into a pile of baby powder. When SHTF (Support Hit The Fan), the most common question I was asked was, “Hey Jon, what version of member services is application X on?”

With every app on an ever-so-slightly-different version of a particular service, the memory usage for an individual application was outrageous. Startup times for some apps were measured in minutes. With every application owning its own services, we had to provision multiple datasources for each app. Caching was unheard of… Things really started getting out of hand when our customers wouldn’t migrate between versions of our applications or SOAP services (who can blame them?). Quite slowly, we grew a sprawling portfolio of slightly similar production applications that we had to keep running. Budgets and teams were shrinking, and this massive project had to take flight; deadlines were just over the horizon, but disaster was brewing in the skies and the sound of delays thundered in the distance.

Philosophically, we claimed we didn’t build monolithic applications. After all, our applications had a “separation of concerns,” but only in the sense that each concern was a separate jar file. We had “services,” but these services were neither distributed nor redundant. We knew we had to go the distributed route… so we started to “fix” our problems by making internal SOAP services available (after all, in Spring, everything is a bean, so why can’t a proxy be a bean?). Our latency skyrocketed. Function calls that took 1ms now took 40+ms. We had to limit these to very coarse-grained calls, but that just caused our SOAP payload size to swell. We brought in individual caches for SOAP calls, but then we faced memory and coherency issues.

Back to the project: realizing that we had in fact been building “monolithic” applications, we had to go distributed. But how? My coworker said, let’s just replace our SOAP calls with REST calls and we’ll be in great shape. So we set off, my teammate writing the frontend in JSP/YUI while I concentrated on developing the backend services and our usual massively complex Spring assembly. Benchmarking my calls, I saw we had gone from 40ms SOAP calls to 30ms REST calls! What??? REST was supposed to be the answer to all that was life! Panicking, I profiled the CXF/Spring code and realized I was spending all day parsing strings! The only real benefit REST bought me was that there were fewer strings to parse!

I stopped. I know I’m supposed to “optimize last”, but clearly, our design or technology choices were flawed. The services had complex bindings and a million strange annotations littered about them. And oh man, did we have a ton of Spring XML files. Something had to change…

About that time, I stumbled on a post by Adam Bien… Take a look at some EJB3.0 code he published:

@Remote
public interface BookService {
	Book createOrUpdate(Book book);
	void remove(Book book);
	Book find(Object id);
}

To invoke that service remotely (to ‘inject’ or ‘look up’ that service), you do this:

@EJB
private BookService service;

Wait, isn’t EJB code supposed to be complex? Isn’t it full of XML, vendor-specific deployment descriptors, RemoteExceptions, portable objects, narrowing, IIOP, CORBA, RMI, lookups, LocalHomes, RemoteHomes, object pools, RMIC compilers, IDLs, and InitialContexts? Doesn’t every method have to throw annoying checked exceptions? And most of all, when the hell did EJBs get Java annotations???

Here’s the crux of this post: The above code is expressive and clear in its intent. Even a Ruby programmer (joke) could see that we’re defining a service contract here that will be offered remotely. This style of definition delivers a certain elegance in its simplicity.
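
For context, the bean behind that interface stays just as small. Here is a minimal sketch of what an implementation might look like; the class name and the JPA-backed method bodies are my assumptions, not code from Adam Bien’s post:

import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

// Hypothetical implementation of the BookService interface shown above.
@Stateless
public class BookServiceBean implements BookService {

	@PersistenceContext
	private EntityManager em;

	public Book createOrUpdate(Book book) {
		return em.merge(book);
	}

	public void remove(Book book) {
		em.remove(em.merge(book));
	}

	public Book find(Object id) {
		return em.find(Book.class, id);
	}
}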

…and I realized that I had a possible solution to my problems. I quickly ripped out my Spring/CXF/REST job and reduced the clutter to a few lines of code. I deleted my Spring files and flung XML into the recycle bin. The moment of truth came with the benchmarks… EJB under the covers uses Java serialization… and it’s fast… a lot faster than SOAP and still faster than REST. My response times dropped below 10ms. Payloads that were single lines of text dropped to 3ms or less… wow! Better yet, getting rid of the Spring jars, the CXF jars, and their laundry list of transitive dependencies took our deployment artifacts from 20 megabytes to 700 kilobytes. Now that is lightweight! I got approval to move forward with the technology and I surged forward, ripping to shreds my previous XML excursion in a wake of JEE.

Now EJB3.0 isn’t without its own set of problems, which I will cover in later posts, but overall, it offers several key advantages:

  1. It’s lightweight: You can write full services using 4 annotations and deploy under a megabyte. I haven’t seen applications that small since Windows 3.1.
  2. It’s fast: Well faster than REST/SOAP… I’m sure there are probably faster binary protocols, but hey, Java Serialization is built into the JVM and it spanks any text protocol.
  3. It’s familiar: You still have to be conscious that you’re in a distributed system, but most developers have used Java Serialization. For most POJOs, you don’t have to do anything except implement Serializable (see the sketch after this list).
  4. It’s transparent: It’s amazing. The EJB3.0 annotations are minimal and powerful at the same time. Developer productivity is high, and there is no generated boilerplate code. It’s easy to teach these concepts to people who didn’t bear the tragedy of EJB2.1.
  5. It’s reliable: EJB has always been a cluster-able technology, if your application container supports it.
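
As promised above, here is roughly all it takes to make a POJO travel across one of these remote calls. The Book class below is hypothetical, with fields chosen purely for illustration:

import java.io.Serializable;

// Hypothetical value object passed to and from the remote BookService.
// Implementing Serializable is the only wire-related requirement.
public class Book implements Serializable {

	private static final long serialVersionUID = 1L;

	private Long id;
	private String title;

	public Long getId() { return id; }
	public void setId(Long id) { this.id = id; }

	public String getTitle() { return title; }
	public void setTitle(String title) { this.title = title; }
}
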
The story doesn’t end here, as this was just the first iteration of the application. The new member portal application is happily humming along in production. We’ve had a few hiccups with WebSphere 7, but for the most part, it’s been relatively problem free. We’re also finally starting to reap the benefits of a distributed SOA architecture. Our applications share connection pools, we’re cutting down to a few core services, deployment artifact size is 20x smaller, and memory usage is dropping.

In future posts, I’ll write examples of just how easy it is to get these services running, and why Java is a great distributed language when coupled with JEE6. You can have developer productivity (and fun) without sacrificing reliability on bleeding-edge dynamic languages.

Footnote:

A great resource for EJB3.0 is the free book Mastering EJB, 4th Edition. I read the entire thing in a weekend on an Evo 4G. It’s brief, but it will give you a working knowledge to get off and running.
Also, Adam Bien’s weblog is a great resource for everything EJB3+ and JEE6.