Monday, 30 January 2017

Maintaining Legacy Software

Weeds

"A weed is a plant that has ended up in the wrong place. Either it got there through carelessness, or it has been allowed to germinate and grow from being insignificant." (New Farmers Handbook)
Software is full of weeds. By weeds I mean badly architected software, not outright bugs. But believe me: bugs arise from it later, even if they are not there yet. Weeds do not come only from sloppy programmers and architects. In my last blog post I wrote about architecture in legacy software, and by weeds I mean the parts of the software that do not adhere to the architecture: vines growing between the layers of the chosen software model.

When creating an architecture we often have high ambitions, whichever approach we use: Domain-Driven Design, Model-View-Controller or similar. Folders and modules are created to fit the ways of working and the separation of concerns that were current at the time.

Since work started, the product owner has ordered things and you have had to solve problems. You are forced to be pragmatic to deliver the feature that should be in the sprint. An advanced architecture sometimes feels as if it stands in the way of the simple solutions you need for all the problems you have to solve. You take shortcuts and allow a view to do something that a controller should have done, only because it is convenient or because you do not want to intrude on someone else's code.

The project also evolves as it continues. Someone new takes over parts of the code and starts writing things. This new person has a different view of architecture, and the code starts to become inconsistent. Different parts of the code follow different cultures, and the cultures evolve. The way of writing code does not stay the same over time. Old code may follow fine "structured programming", as the paradigm was called then. Then there might be a period of object orientation. Object orientation itself has evolved, from complex objects handling everything from infrastructure to APIs in the same object, to modern ways of looking at separation of concerns. The revolution of patterns thinking can leave traces. It is like archaeology, with cultural layers to dig through.

Weeding

It is important to remove weeds. Keep in mind that a lot of weeds can be useful elsewhere. A weed is often something that has ended up in the wrong place, rather than being wrong in itself. A good refactoring tool lets you move it to the right place with a few clicks, if you know where to put it. Step one is to separate the misplaced code into a new method, if it is not already separate. The IDE will often have refactoring tools that help with extracting a method. Then move the method to the class it belongs to. If it is not useful at the moment, “compost” it: remove it and let the version control system keep it.
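The two weeding steps can be sketched in Java. This is only an illustration under my own assumptions: the class names, the VAT rate and the scenario (a view that used to calculate an invoice total while rendering) are all hypothetical. The code shows the state after both steps, with the calculation extracted and moved out of the view.

```java
import java.util.List;
import java.util.Locale;

// Step 1 (extract method) pulled the calculation out of the view's render code.
// Step 2 (move method) put it in the class it belongs to.
class InvoiceCalculator {
    static final double VAT_RATE = 0.25; // assumed rate, for illustration only

    double totalWithVat(List<Double> netPrices) {
        double net = 0;
        for (double price : netPrices) {
            net += price;
        }
        return net * (1 + VAT_RATE);
    }
}

// The view no longer calculates anything; it only formats the result.
class InvoiceView {
    private final InvoiceCalculator calculator;

    InvoiceView(InvoiceCalculator calculator) {
        this.calculator = calculator;
    }

    String render(List<Double> netPrices) {
        return String.format(Locale.ROOT, "Total: %.2f", calculator.totalWithVat(netPrices));
    }
}
```

The weed was never wrong in itself: the same calculation is kept, it just grows in the right bed now.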

Cropping

This problem is close to weeding. It deals with branches from one part of the code reaching into other parts, like runners from a plant. This may involve unwanted dependencies, where code reaches too far into the internals of other code. If you need to expose anything outward, it requires a clear interface. Create one, and do not expose the interior of your code. Create a contract: a clear API through which the different parts of the code communicate with each other.
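A minimal sketch of such a contract in Java could look like the following. All names are hypothetical and chosen only for illustration. The point is that callers depend on the interface, while the implementation stays free to change.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// The contract: the only thing other parts of the code may depend on.
interface CustomerDirectory {
    Optional<String> emailOf(String customerId);
}

// The interior: free to change without affecting callers.
final class InMemoryCustomerDirectory implements CustomerDirectory {
    private final Map<String, String> emailsById = new HashMap<>();

    void register(String customerId, String email) {
        emailsById.put(customerId, email);
    }

    @Override
    public Optional<String> emailOf(String customerId) {
        return Optional.ofNullable(emailsById.get(customerId));
    }
}

// A caller in another part of the code sees only the interface.
final class Newsletter {
    private final CustomerDirectory directory;

    Newsletter(CustomerDirectory directory) {
        this.directory = directory;
    }

    String recipientFor(String customerId) {
        return directory.emailOf(customerId).orElse("unknown");
    }
}
```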

Bugs and vermin

This is perhaps the most common kind of problem in software, something we all know. Bugs can be fought in many ways. Spraying is not uncommon in agriculture, and the software equivalent is the various kinds of automatic linters. Just as with spraying in agriculture, you should be careful with these tools. They complain too much, so developers tend to ignore them more and more. You may miss large errors because they are hidden among numerous small ones. Therefore, set your linter to a reasonable level for your project.

Monday, 9 January 2017

Architecture Legacy Software

The software developer has typically used the engineer as a model. This obviously springs from the fact that we are dealing with technology. We build stuff! Our line of work grew out of electrical engineering and mathematics; hand in hand, they became a new profession.

Development also has a soft side that you might not see from the outside, and that not all developers discover themselves. They become stuck in problem solving and are happy that things are working. Programmers borrow many words from linguists. We have languages to describe what we do; languages have a grammar and help us communicate. Communication is not only between us and the computer, but also with other developers. Really well-written code can be like poetry for those who understand it. Code is in many ways a cultural manifestation, almost art.

Most books and articles we read about software development describe an ideal state. You get recipes for how to design the project and how to write the code to create good software. Those of us who have worked with software development and read them usually agree that these are good principles, but in reality it usually does not look like that. We are moving in a landscape where the old code lies in archaeological layers under our feet, and we do not know what we will find when we dig among them.

The layers have also been moved around through changes made by several developers over the history of the code base. The way to write code has changed over time. What was considered best practice a few years ago is no longer so today. Different and changing views of what is right, over time and between developers, collide and make the code messier. Each developer has his or her own perspective, often a good one in its context.

Software systems are often described as machines. The technical background shines through. To me they are usually more organic: small systems are like gardens, which are sometimes tied together into large systems and thus into an entire cultural landscape. This is an attempt to describe software development from the perspective that software is something that grows under our supervision rather than something constructed. We are gardeners or farmers who grow software.

Now you may be thinking: software does not grow, and it does not write itself. That is of course true, but a body of software written by a team over a long time tends to have almost a life of its own. So many factors influence its development that not everything will be rationally designed or constructed. If you do not take this into account it may become a problem, but when used correctly it can become a strength and improve the system.

Taking care of software is in many ways similar to tillage. It is a process of cultivation. The idea of this text is to create a good culture in which to grow and develop software as well as yourself. I would like to describe an iterative process, much like the seasons, that adapts the development process to reality, instead of dreaming of the ideal project where each sprint, from planning to retrospective, is merely a moment of happiness in which we commend each other and drink coffee with smiles like those in a religious advertising brochure.

The Swedish Farmer's Almanac (Bondepraktikan) was originally a German book of advice to farmers on how to adapt their work to the seasons and therefore the weather. It eventually became incorporated into our Swedish cultural tradition. The book contains a lot of advice on how to manage farming by reading the signs of nature.


"Pious readers buy me now.
Much wisdom I will teach you.
Bondepraktikan is my name.
Read me, you will benefit.
The flow of the year I want you to learn, then you will rule."

(Bondepraktikan - “Swedish Farmers Almanac”)

This is an attempt to give some thoughts on an almanac for software development.


Planning of the lands


The main objective of planning is to create separation of concerns to keep the system stable.
Illustration 1: A Map of the Lands (New Farmers Handbook)


In computer programming, the term separation of concerns means using different mechanisms to separate things that have nothing to do with each other. The best known is perhaps the MVC pattern, which separates models, views and controllers.

Often the planning was done long before you entered the lands. Parts have been added and other parts have been removed. Parts might have been split in not so logical ways, and parts may have grown into each other. Some projects are quite messy; others stay orderly for their whole lifetime. There might be circular dependencies that make changes hard.

Regardless of the project status, you should have a map of reality. Often the architectural maps you get are old and not updated. It is best if you have a system where the code itself is the map or generates the map. With good package and sub-package names you always have a current map.

When planning a new project, refactoring, or adding your own architecture, you should have a well-documented description of how you separate concerns.

First note that the separation runs in two different directions: the different kinds of data the application handles (like persons and books) and the different logical layers of the application (as in MVC). The data is often separated into different bounded contexts, and the application is often separated into layers handling things like storage, business logic and front end.

There is no “right” way to do things. Below I present a model that I like and that may inspire the model you think is right for your project.

Note that you should always begin with reality. Use the existing landscape and think of change as a long process. In Swedish history we have had land reforms (enskifte, laga skifte, etc.) decided by a king and applied locally all over the country. The same could be done in software, a land reform across the whole system, but it is hard and goes against organic growth. It might also hinder development while the reform is under way.

That being said, here is my model:
Infrastructure (supports all three layers below)
Application
Interface
Domain
Table 1: The layer structure

Application layer

The application layer is responsible for driving the workflow of the application. This layer can be used for transactions, high-level logging, and security.
The application layer is often thin compared to the domain logic: it just coordinates domain objects without actually performing the work. The application layer also gives the interface layer access to the domain layer. The application starts up the services and the interfaces needed, but then delegates the work to them.
Most of this layer is often given to you out of the box by the framework you use, when you generate a standard application skeleton with that framework.
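To illustrate how thin this layer can be, here is a small Java sketch. The names are hypothetical, and a plain list stands in for real logging; a real application would typically also open and commit a transaction here. The application service only coordinates, while the rules live in the domain object.

```java
import java.util.ArrayList;
import java.util.List;

// Domain side (hypothetical): does the actual work and enforces the rules.
final class Account {
    private long balanceInCents;

    Account(long balanceInCents) {
        this.balanceInCents = balanceInCents;
    }

    long balanceInCents() { return balanceInCents; }

    void withdraw(long amount) {
        if (amount <= 0 || amount > balanceInCents) {
            throw new IllegalArgumentException("invalid amount: " + amount);
        }
        balanceInCents -= amount;
    }

    void deposit(long amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("invalid amount: " + amount);
        }
        balanceInCents += amount;
    }
}

// Application layer: drives the workflow and logs, with no business rules of its own.
final class TransferApplicationService {
    private final List<String> auditLog = new ArrayList<>(); // stand-in for high-level logging

    void transfer(Account from, Account to, long amountInCents) {
        auditLog.add("transfer requested: " + amountInCents);
        from.withdraw(amountInCents); // the domain decides whether this is allowed
        to.deposit(amountInCents);
        auditLog.add("transfer done: " + amountInCents);
    }

    List<String> auditLog() { return auditLog; }
}
```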

Interface Layer

This layer holds everything that interacts with other systems, such as web services, RPC, user interfaces or web applications. It handles the interpretation, validation and translation of incoming data. It also handles serialization of output data, such as JSON over REST.

This layer is often called the Ports and Adapters layer. We can have a number of possible ways of connecting (ports), and adapters that adapt the domain objects to the protocols and act as an anti-corruption layer.

Domain objects are never directly visible outside the domain. They have other representations in the protocol. This is essential to keep the system stable. You should be able to change the domain model while keeping the old interfaces stable. Instead you define DTOs (Data Transfer Objects) that represent the data sent, regardless of the internal model. A DTO converter is used to translate between domain objects and the world outside. This is true even for communication inside your application. Different bounded contexts should communicate in a defined way.
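A DTO and its converter can be sketched like this in Java. The classes and the "full name" format are my own assumptions for the example. Note that the domain object keeps two name fields while the outside world sees one; the converter is the only place that knows both shapes.

```java
// Domain object: the internal model, free to change.
final class Person {
    private final String givenName;
    private final String familyName;

    Person(String givenName, String familyName) {
        this.givenName = givenName;
        this.familyName = familyName;
    }

    String givenName() { return givenName; }
    String familyName() { return familyName; }
}

// DTO: the representation promised to the outside world.
final class PersonDto {
    final String fullName;

    PersonDto(String fullName) {
        this.fullName = fullName;
    }
}

// Converter: translates between the domain object and the DTO.
final class PersonDtoConverter {
    PersonDto toDto(Person person) {
        return new PersonDto(person.givenName() + " " + person.familyName());
    }

    Person fromDto(PersonDto dto) {
        // Validate incoming data before it reaches the domain.
        int split = dto.fullName.lastIndexOf(' ');
        if (split <= 0 || split == dto.fullName.length() - 1) {
            throw new IllegalArgumentException("expected 'given family': " + dto.fullName);
        }
        return new Person(dto.fullName.substring(0, split), dto.fullName.substring(split + 1));
    }
}
```

If the domain model later changes, for example by adding a middle name, only the converter has to be adjusted and the old interface stays stable.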

If you want to combine DDD with an MVC or MVP pattern in a desktop application, this can be where to place your views and controllers, as one of the ports. Another way is to think of the MVC part of your application as a separate client that connects through the interface layer.

The domain layer

The domain layer is the heart of the program, and that is where all the interesting stuff happens. In Java, you normally create one package per bounded context, and each package includes entities, value objects, domain events, a repository interface, and sometimes factories.
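As a rough sketch of what such a package might contain, here are a value object, an entity and a repository interface in Java. The library example and all names are hypothetical. The value object is compared by its value, while the entity is compared by its identity.

```java
import java.util.Optional;

// Value object: no identity of its own, immutable, compared by value.
final class Isbn {
    private final String value;

    Isbn(String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("ISBN must not be empty");
        }
        this.value = value;
    }

    String value() { return value; }

    @Override
    public boolean equals(Object other) {
        return other instanceof Isbn && ((Isbn) other).value.equals(value);
    }

    @Override
    public int hashCode() { return value.hashCode(); }
}

// Entity: has an identity that survives changes to its attributes.
final class Book {
    private final long id;
    private final Isbn isbn;
    private String title;

    Book(long id, Isbn isbn, String title) {
        this.id = id;
        this.isbn = isbn;
        this.title = title;
    }

    long id() { return id; }
    Isbn isbn() { return isbn; }
    String title() { return title; }

    void rename(String newTitle) { this.title = newTitle; }

    @Override
    public boolean equals(Object other) {
        return other instanceof Book && ((Book) other).id == id;
    }

    @Override
    public int hashCode() { return Long.hashCode(id); }
}

// Repository interface: declared in the domain, implemented elsewhere.
interface BookRepository {
    Optional<Book> findById(long id);
    void store(Book book);
}
```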

The core of the business logic is here. The structure and the naming of assemblies, classes and methods in the domain layer follow the ubiquitous language, and you should be able to explain to a domain expert how this part of the program works by drawing a few simple graphs and using real class and method names from the source code.

The interface to the upper layers and to other bounded contexts is provided by one or more services, which are adapted by the interface layer (the ports and adapters layer) when someone wants to connect to the service from outside the domain. The interface is defined by the service.

Repositories and the factories have to use the infrastructure to get access to the database. They thus define interfaces for Data Objects (DO) and Data Access Objects (DAO) used to store entities in the database. The implementation is later done in the infrastructure, but the domain is in charge of the data and logic. The domain drives the development.

The relationship between the repositories, the factories and the DAO might be hard to understand. The responsibility of the DAO is simple storage. All domain logic should ideally be in the repository. If you, for example, need to validate data before storing it, the repository should handle this and then use the DAO to store the data of the entity. Most calls to repositories have the structure of first doing some domain logic and then storing the result with the DAO as delegate.
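The division of labour between repository and DAO might look roughly like this in Java. Everything here is a hypothetical sketch: the member example, the deliberately naive email check, and the map-based DAO that stands in for a real infrastructure implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Data object (DO): the plain data the infrastructure stores.
final class MemberData {
    final String id;
    final String email;

    MemberData(String id, String email) {
        this.id = id;
        this.email = email;
    }
}

// DAO interface: defined by the domain, simple storage only.
interface MemberDao {
    void save(MemberData data);
    MemberData load(String id); // returns null when nothing is stored
}

// Entity: the domain's abstraction of a member.
final class Member {
    private final String id;
    private final String email;

    Member(String id, String email) {
        this.id = id;
        this.email = email;
    }

    String id() { return id; }
    String email() { return email; }
}

// Repository: domain logic first, then storage delegated to the DAO.
final class MemberRepository {
    private final MemberDao dao;

    MemberRepository(MemberDao dao) {
        this.dao = dao;
    }

    void store(Member member) {
        if (!member.email().contains("@")) { // naive check, for illustration only
            throw new IllegalArgumentException("invalid email: " + member.email());
        }
        dao.save(new MemberData(member.id(), member.email()));
    }

    Member find(String id) {
        MemberData data = dao.load(id);
        return data == null ? null : new Member(data.id, data.email);
    }
}

// Stand-in for an infrastructure implementation: here only a map.
final class InMemoryMemberDao implements MemberDao {
    private final Map<String, MemberData> rows = new HashMap<>();

    @Override public void save(MemberData data) { rows.put(data.id, data); }
    @Override public MemberData load(String id) { return rows.get(id); }
}
```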

One DAO should only handle one type of DO. Likewise, a repository should ideally handle only one type of entity. To handle more complex operations that involve many types of entities, you should create a domain service that uses the factories and repositories. A domain service should cover one bounded context.
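A domain service that spans two entity types could be sketched as follows in Java. The lending example and the two repository interfaces are hypothetical, and reduced to the few methods the service needs.

```java
// Hypothetical repository interfaces, one per entity type.
interface BookRepository {
    boolean isAvailable(String isbn);
    void markAsLent(String isbn, String memberId);
}

interface MemberRepository {
    boolean isActive(String memberId);
}

// Domain service: an operation that involves more than one type of entity.
final class LendingService {
    private final MemberRepository members;
    private final BookRepository books;

    LendingService(MemberRepository members, BookRepository books) {
        this.members = members;
        this.books = books;
    }

    // Returns true when the loan was registered, false when it was refused.
    boolean lend(String memberId, String isbn) {
        if (!members.isActive(memberId) || !books.isAvailable(isbn)) {
            return false;
        }
        books.markAsLent(isbn, memberId);
        return true;
    }
}
```

Each repository still handles one kind of entity; only the service knows that a loan needs both.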

We will get back to the relationship between a domain entity and a data-object. For now we can say that the entity is the abstraction of the data we handle and the data-object is the actual data stored by the infrastructure.

This is what makes some people think the model is a bit cumbersome: we have to define a lot of interfaces in the domain that are later implemented in the infrastructure. The different abstraction levels also add to the complexity. The system also has a lot of classes in the domain handling the same entities: the services, repositories and factories. The system becomes a bit heavy to handle with these extra layers of abstraction. The benefit is the separation of concerns and the possibility to change infrastructure without having to change the domain logic.

Infrastructure Layer

In addition to the three vertical layers, there is also infrastructure. The application runs in some environment where it has to use infrastructure services.

In the farming analogy this is the machine park. The solution should not depend on any special machines. You should be able to “grow your crops” with different sets of machines and replace them as time passes. As the picture shows, the infrastructure supports all three layers. This is done in different ways, which facilitates communication between the layers. Simply put, the infrastructure consists of everything that exists independently of our application: external libraries, database, application server, messaging and so on.

All communication between the layers should be abstracted so that it is independent of changes in other layers. The clearest example of this is that the domain layer uses a repository to manage storage. This repository then defines interfaces for the database connection towards the infrastructure, but the implementations are in the infrastructure layer.

Although it can be difficult to give a hard definition of what type of code belongs to the infrastructure layer in each given situation, it should be possible to completely replace the infrastructure and still be able to use the domain layer, and possibly the application layer, to solve the key business problems.
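Replacing the machine park can be sketched like this in Java. The reminder example and all names are assumptions made for illustration. The domain defines the interface it needs, and two interchangeable infrastructure implementations plug into it without the domain class changing.

```java
import java.util.ArrayList;
import java.util.List;

// Defined by the domain: what it needs, not how it is done.
interface ReminderSender {
    void send(String memberId, String message);
}

// Domain logic, unaware of any concrete infrastructure.
final class OverdueReminder {
    private final ReminderSender sender;

    OverdueReminder(ReminderSender sender) {
        this.sender = sender;
    }

    void remind(String memberId, int daysOverdue) {
        if (daysOverdue > 0) {
            sender.send(memberId, "Your loan is " + daysOverdue + " days overdue");
        }
    }
}

// Infrastructure variant 1: writes to standard output.
final class ConsoleReminderSender implements ReminderSender {
    @Override
    public void send(String memberId, String message) {
        System.out.println(memberId + ": " + message);
    }
}

// Infrastructure variant 2: keeps messages in memory, for example in tests.
final class RecordingReminderSender implements ReminderSender {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String memberId, String message) {
        sent.add(memberId + ": " + message);
    }
}
```

Swapping `ConsoleReminderSender` for `RecordingReminderSender` is a one-line change where the objects are wired together, which in this model would be the application layer.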

Bounded contexts



Now we are entering the main part of planning your land: the domain.

Context

The environment in which a word or proposition appears and which determines its meaning.

Bounded contexts are a breakdown of the domain. All major projects cover several sub-models, and when code based on the different models is combined, it can become complex. Communication to, from and within the team gets confusing. It is often not clear in which context a model should not be applied. The model must be divided into bounded contexts with clear interfaces to other parts of the system. Each bounded context should be independent of the others.

When a number of people work in the same bounded context, there is a risk that the model becomes fragmented. The more people who are involved, the greater the problem. Breaking the system down into ever smaller contexts, on the other hand, eventually loses integration and coherence. There must be a process for merging all code and other artifacts, with automated testing. Use the ubiquitous language to hammer out a common understanding of the model and its concepts. Continuous integration is an essential part of DDD.

Implementations that have grown over time without a good analysis of the bounded contexts often have a design that could be called a BBoM, a Big Ball of Mud. The borders between the parts of the server are unclear. Even the responsibilities of the client and the server are entangled. There is no good way to see the protocols.

Context Map

A bounded context on its own leaves some problems, because a global perspective is missing. The relationships with other parts of the model may still be vague and in constant change.

Developers in other teams may lack knowledge of your context's limits and will unknowingly make changes that blur the edges or complicate the relations. When connections are made between different contexts, they tend to bleed into each other.

Identify each sub-model in the project and define its bounded context. Name each bounded context and make the name part of the ubiquitous language. Describe the points of contact between the models and how they communicate. Preferably, communication goes through service classes and not directly into another bounded context.
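A point of contact between two bounded contexts could be sketched like this in Java. The catalog and lending contexts and all class names are hypothetical; in a real project each context would live in its own package. The lending context talks to the catalog only through its service and never touches the catalog's internal model.

```java
import java.util.HashMap;
import java.util.Map;

// --- Catalog context (hypothetical) ---

// Internal model of the catalog, not meant for use outside this context.
final class CatalogEntry {
    final String isbn;
    final int copiesInStock;

    CatalogEntry(String isbn, int copiesInStock) {
        this.isbn = isbn;
        this.copiesInStock = copiesInStock;
    }
}

// The catalog's published point of contact.
interface CatalogService {
    boolean hasCopies(String isbn);
}

final class DefaultCatalogService implements CatalogService {
    private final Map<String, CatalogEntry> entries = new HashMap<>();

    void add(CatalogEntry entry) {
        entries.put(entry.isbn, entry);
    }

    @Override
    public boolean hasCopies(String isbn) {
        CatalogEntry entry = entries.get(isbn);
        return entry != null && entry.copiesInStock > 0;
    }
}

// --- Lending context (hypothetical) ---

// Knows the catalog only through CatalogService, never through CatalogEntry.
final class LoanDesk {
    private final CatalogService catalog;

    LoanDesk(CatalogService catalog) {
        this.catalog = catalog;
    }

    String request(String isbn) {
        return catalog.hasCopies(isbn) ? "loan approved" : "put on waiting list";
    }
}
```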