Event Driven: Monitoring & Integration

Sharing logging context across application boundaries

2014-05-16T09:09:00.000-07:00

How do you share logging context across application boundaries? Let me try to illustrate the challenge.

Example: 2 applications (sender/receiver) exchanging requests and replies. How does the sender flag a specific request to be traced or tracked across both applications at runtime?

Well, today Java developers who use log4j or any other logging framework would enable debug on both applications (which also requires knowing which categories to enable) and get all kinds of messages in the log. Then you have to analyze the log and pick out only the entries related to the specific request/reply pairs of interest.

TNT4J provides a facility for sharing context across applications called shared conditional logging. The idea is to establish a shared pool of tokens (key/value pairs) available to all applications at runtime. These tokens can be added, removed and updated on the fly, so logging context or any other context can be communicated to all applications at runtime.

This simple model lets a sender application set a token/value pair and pass it along (out of band) to the receiver. Both apps can check the trace level of a specific token to determine whether logging is needed. The result is that only specific request/reply pairs are tracked across 2 or more applications.
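To make the idea concrete, here is a minimal sketch of conditional logging driven by a shared token pool. It does not use the TNT4J API: SharedTokenPool, Level and receive are hypothetical names, and a map inside a single JVM stands in for a pool that would really be reachable by all applications.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a shared token pool; this is NOT the TNT4J API.
// A map inside one JVM plays the role of the pool that, in a real setup,
// would live somewhere all participating applications can reach.
class SharedTokenPool {
    // Ordered from least to most verbose.
    enum Level { INFO, DEBUG, TRACE }

    private final Map<String, Level> tokens = new ConcurrentHashMap<>();

    void set(String token, Level level) { tokens.put(token, level); }

    void remove(String token) { tokens.remove(token); }

    // True when the token exists and is flagged at least as verbose as 'wanted'.
    boolean isSet(String token, Level wanted) {
        Level current = tokens.get(token);
        return current != null && current.compareTo(wanted) >= 0;
    }
}

class ConditionalLoggingDemo {
    // Receiver side: the request id travels with the request (out of band),
    // and detailed logging happens only if that id is flagged in the pool.
    static void receive(SharedTokenPool pool, String requestId, StringBuilder log) {
        if (pool.isSet(requestId, SharedTokenPool.Level.DEBUG)) {
            log.append("receiver: handling ").append(requestId).append('\n');
        }
    }
}
```

The sender would call pool.set("order-42", SharedTokenPool.Level.DEBUG) before sending; only the request carrying that id produces log output on the receiver, and removing the token turns the logging off again without touching any logger configuration.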

This approach saves developers a ton of time, reduces the overhead associated with enabling debug mode for everything, logs only what is needed, and therefore reduces the amount of manual analysis and simplifies the diagnostics phase.

I am using this framework in my own project and so far with great results.

Launched TNT4J -- Java Open Source Project for Tracking, Tracing Application Behavior

2014-05-13T06:02:00.000-07:00

I am restarting my blog after a few years of silence with the launch of a new Java open source project (TNT4J), available on GitHub at https://github.com/Nastel/TNT4J. The mission of the project is to deliver a production quality logging framework that significantly outperforms existing simple logging frameworks such as log4j, syslog, etc.

Simple logging of a severity/message combo is just not enough when it comes to truly distributed, concurrent applications. Frankly I was tired of going through log files and trying to figure out why apps behave the way they do. I created this framework to deal with 3 basic logging problems:

How do I log only what is needed, across applications and runtimes? isDebugEnabled() simply is not enough and produces too much unrelated data across concurrent apps.
How do I correlate and relate log entries, within and across logs, that belong to a logical activity such as an order process or a request, across multiple threads, applications and runtimes? Logs are just simply a big mess.
How do I record important metrics and the state of the app or business process? Many times I ask myself what else was going on when this error occurred. What was the GC doing, what was the memory usage, what were my app's internal variables, etc.? Much of this info is simply hidden and not available.

Of course one could say, "well, use profilers and such, problem solved". The answer is simple: when you develop and deliver apps to users and they have problems, what do you tell them? Go use a profiler, buy an application diagnostic tool, go debug the application YOU developed (mobile or otherwise)? With many copies of your apps running across a variety of devices, servers and desktops, how do you troubleshoot your application running elsewhere? Most developers request logs, and the log analysis nightmare begins.

Crash logs have too much system and runtime data and lack the application specific data required to understand application logic. To troubleshoot application behavior or misbehavior one needs to know how the application behaves and log that in addition to what the runtime is doing (stack, traces, VM info, memory, etc).

Most logs are simply useless and a mess -- either too much data, too much text, not enough context, not enough relations. TNT4J addresses this problem head on. I created this project to help myself with this problem.

I think developers may find TNT4J very useful. Let me know what you think. I welcome feedback, collaborators and adopters. All Welcome.

Business Transaction Management vs. Business Transaction Performance

2010-03-08T08:26:00.000-08:00

BTM or Business Transaction Management vs. Business Transaction Performance -- two terms aimed to describe the current state of the affairs in what Gartner calls Transaction Profiling. Ever since I came across the term BTM I questioned whether the term actually reflects what vendors do in this space. The word "management" implies a bi-directional relationship between the manager and the entity being managed. In the world of Application Performance Management the term management implies "measure, monitor, administer, control, plan, automate, improve". If anything the BTM should be redefined as Business Transaction Performance Management or BTPM. Transaction Profiling (Gartner's definition) while more accurate implies a specific implementation of how performance is actually accomplished -- "profiling". One can envision measuring transaction performance without actually doing any profiling. It seems that profiling is an implementation construct and as such should be avoided when naming a broad discipline such as this. In fact BTM, as defined, is really a derivative of Business Process Management rather than an Application Performance Management discipline.

The term BTM actually confuses the marketplace. What part of "management" is actually being done by the vendors in the space? Most if not all vendors in this space measure performance and report. Any proactive involvement in the transaction lifecycle itself is minimal or not practical in most cases. How practical is it to define application and business logic within a transaction "management" tool? And even if it were feasible, wouldn't it be better to do this in the BPM orchestration layer? Managing the transaction lifecycle is already defined by the Business Process Management discipline and as such belongs in the BPM space. Today's transactions are orchestrated and therefore managed by widely known BPM tools from IBM, Microsoft, Oracle and others. So either BTM is part of BPM rather than APM (and if this is true, do we really need another term to describe the same thing?), or BTM simply is all about performance and therefore "management" should be dropped from the acronym.

No matter what we call things, it is important to understand what these things actually are in reality. BTM, no matter what vendors say, focuses on performance and measurement. Any active involvement in the transaction lifecycle, while possible, is in many cases impractical and in most not desirable for many reasons. So BTM is really about performance, and in my view BTP (Business Transaction Performance) or BTPM (Business Transaction Performance Management) are more appropriate. Keeping terms honest is important and benefits the end user. Why? Because we are already awash in so many terms, abbreviations, acronyms, technologies, products and vendors with "super-natural abilities". What we need is simplicity and clarity rather than ambiguity and complexity.

Technology Overload

2009-10-29T15:52:00.000-07:00

I am completely convinced that just like we've produced too many cars, too many houses, too much credit and sadly too many dollars, we have also produced too much technology, software products, packages and solutions. The result is that organizations are not only confused but unable to absorb the technology and products that they already own. Over the past decade enterprises acquired too many products, a large portion of which have become shelfware. So what is the response of the corporate CIO? Vendor/product committees, tool consolidation, vendor consolidation and other tactics to keep new vendors and technologies away and make do with what they already own.

Enterprise solutions are so complex and vendor messaging so confusing and ambiguous that you often need Gartner or some other research agency to decode what is what. The number of new terms and abbreviations is just staggering. The best way to deal with complexity is ... simplicity. I like the KISS approach: Keep It Simple and Stupid, or Stupid and Simple. But unfortunately that is not what is happening.

On events, non-events and metrics

2008-04-21T09:49:00.000-07:00

I would like to talk about events, non-events and metrics (aka facts). Facts are elements of truth, usually expressed as a name=value pair. Some examples of factual information: current_temperature=30F or CPU usage=30%. Of course this assumes that the measurement instrument being used is accurate. When monitoring applications, systems or business services, facts are the key performance indicators that reflect the state, availability and/or performance of a given service, system or subsystem.

So what are events and how are they different from facts? An event is a change in the state of one or more facts. A “High CPU usage” event simply means that CPU usage has exceeded a certain threshold defined by the observer. So events are just the vehicles by which changes in facts are carried from the source to the observer. Therefore most events, if not all, have the following common attributes: {source, timestamp, variable1, variable2...., cause=other_event_list}. The timestamp is simply the time associated with the change of a fact's state or attribute. Example: temperature changed from 20 to 30F. One can design an event generator that creates add, remove and change events every time a fact is added, removed or changed. These events in turn can feed into a CEP or EP engine for processing.
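As a rough illustration of such an event generator, the sketch below keeps facts as name=value pairs and records an event only when a fact changes state. FactStore, FactEvent and Kind are names made up for this example, values are assumed to be non-null strings, and the emitted events are simply collected in a list rather than fed to a real CEP engine.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical fact store: every add, change or removal of a fact yields one event.
class FactStore {
    enum Kind { ADDED, CHANGED, REMOVED }

    static final class FactEvent {
        final String source;
        final long timestamp;
        final Kind kind;
        final String name;
        final String oldValue; // null for ADDED
        final String newValue; // null for REMOVED

        FactEvent(String source, long timestamp, Kind kind,
                  String name, String oldValue, String newValue) {
            this.source = source;
            this.timestamp = timestamp;
            this.kind = kind;
            this.name = name;
            this.oldValue = oldValue;
            this.newValue = newValue;
        }
    }

    private final String source;
    private final Map<String, String> facts = new HashMap<>();
    private final List<FactEvent> events = new ArrayList<>();

    FactStore(String source) { this.source = source; }

    // Values are assumed to be non-null.
    void put(String name, String value, long timestamp) {
        String old = facts.put(name, value);
        if (old == null) {
            emit(timestamp, Kind.ADDED, name, null, value);
        } else if (!old.equals(value)) {
            emit(timestamp, Kind.CHANGED, name, old, value);
        }
        // Same value again: the fact did not change state, so no event.
    }

    void remove(String name, long timestamp) {
        String old = facts.remove(name);
        if (old != null) {
            emit(timestamp, Kind.REMOVED, name, old, null);
        }
    }

    // Events emitted so far, in order; a real setup would hand these to an engine.
    List<FactEvent> events() { return events; }

    private void emit(long timestamp, Kind kind, String name, String oldValue, String newValue) {
        events.add(new FactEvent(source, timestamp, kind, name, oldValue, newValue));
    }
}
```

Setting temperature to 20F, then 30F, then 30F again and finally removing it produces three events (added, changed, removed); the repeated 30F reading is a fact but not an event.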

It is also worth noting that detecting non-events should always be in the context of time, (for example non-occurrence within last 5 min or 24 hours). When the time interval expires it is easy to check for occurrence of certain events and evaluate the remaining CEP expression.
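A minimal sketch of time-bounded non-event detection might look like this. NonEventDetector and its method names are hypothetical, timestamps are plain milliseconds supplied by the caller, and only the most recent occurrence of each event is remembered.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: remembers the latest occurrence of each event name
// and answers "did this event NOT occur within the last N milliseconds?".
class NonEventDetector {
    private final Map<String, Long> lastSeen = new HashMap<>();

    void observe(String eventName, long timestampMillis) {
        // Keep the most recent timestamp even if events arrive out of order.
        lastSeen.merge(eventName, timestampMillis, Long::max);
    }

    // True when no occurrence falls inside the window (now - window, now].
    boolean nonOccurrence(String eventName, long nowMillis, long windowMillis) {
        Long last = lastSeen.get(eventName);
        return last == null || last <= nowMillis - windowMillis;
    }
}
```

When the interval expires, a timer would call nonOccurrence("heartbeat", now, 5 * 60 * 1000) and, if it returns true, continue evaluating the rest of the expression.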

SOA World 2007 Observations

2007-11-16T10:56:00.000-08:00

I just came back from SOA World 2007 in the beautiful city of San Francisco. What a nice city, I always love visiting this charming town. But anyway, the reason for my trip was to attend SOA World 2007, so I had no time to enjoy this lovely place.

I see some really important shifts in people’s perception and adoption of SOA infrastructure. Here is my view:

Many people somehow still perceive SOA as being largely about Web Services, which was rather surprising.
The focus around SOA governance revolves primarily around deployment, policy management, and version control. Very few really focus on monitoring, especially performance and transactional monitoring, which is key for assuring the quality of service of SOA enabled applications. Most SOA governance tools are still all about Web Services. What about more advanced SOA deployments with ESB, brokers, and non WebService based environment? That is a much bigger problem to deal with and more complex indeed.
It is now widely accepted that the SOA based paradigm is not suitable for all types of problems, which is good. Previously many somehow believed that SOA was going to solve many if not all problems. I think many are disillusioned; so many projects have failed and are still continuing to fail. At the end of the day, SOA may be just one of the practices in the architect's "toolbox" to solve specific sets of problems. While I agree that most failures cannot be attributed to the SOA concept itself, I think the bigger issue is around people, processes, best practices and expectations.

It is interesting to see a new term like "governance" replacing a good old term like "management". In fact, what is the difference between SOA governance and SOA management? I don’t see any difference. So we have a new slew of concepts and terms, which really add very little over and above the good old terminology.

During one of the presentations I saw the heading "SOA is bigger then SOA" -- very amusing. Not sure what the author meant by that. But somehow it grabbed my attention:)

Time Synchronization and Event Processing

2007-10-28T07:19:00.000-07:00

When processing events, especially when dealing with event occurrence and sequence, time becomes an important factor. Given events A and B, their times of occurrence could be tricky to compare, especially when generated from multiple sources. Time would have to be synchronized for both sources in order to compare time(A) and time(B) and determine whether time(A) < time(B), which is to say that event A occurred before event B.

But then even if we managed to synchronize time, say to the millisecond or to some microseconds, what happens if both events occur within or close to the resolution of the time synchronization, which is to say that the events occurred "simultaneously"?

It would also seem that a CEP processor would have to be aware of the synchronized time resolution in order to judge whether two events qualify as A < B or A <= B. A <= B would be true if the difference in occurrence is less than or equal to the resolution of the time synchronization.
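A resolution-aware comparison can be sketched in a few lines. EventOrdering and Order are names invented for this example; timestamps and the resolution are assumed to be in the same unit and small enough that their difference does not overflow a long.

```java
// Hypothetical resolution-aware comparison of two event timestamps.
class EventOrdering {
    enum Order { BEFORE, SIMULTANEOUS, AFTER }

    // Events closer together than the synchronization resolution cannot be
    // ordered reliably, so they are reported as simultaneous.
    static Order compare(long timeA, long timeB, long resolution) {
        long diff = timeA - timeB;
        if (Math.abs(diff) <= resolution) {
            return Order.SIMULTANEOUS;
        }
        return diff < 0 ? Order.BEFORE : Order.AFTER;
    }
}
```

With a resolution of 10 ms, events stamped 1000 and 1005 compare as simultaneous, while 1000 and 1011 compare as A before B.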

Another approach is to treat the time of event occurrence as the time of reception by the CEP processor, so that all timestamps are based on CEP time and no synchronization is required. However, this method is highly sensitive to event discovery and delivery latencies, which is a problem in most instances.

Can CEP be used to analyze transaction flow?

2007-10-27T12:45:00.001-07:00

While Complex Event Processing (CEP) is a general purpose technology for processing events in real-time, I am wondering how it can be used to analyze transaction/message flow. The basic premise of transaction tracking within SOA and message driven architecture is to identify the flow of messages by observing message exchanges between systems and applications. In such an environment messages have to be related, correlated and analyzed to determine the beginning and the end of the flow and the timings of each exchange, as well as to discover hidden relationships. Transaction boundaries are determined by observing units of work and relating them based on message exchanges. Some time ago I published a method of correlating transactions, which describes the basic mechanics.

So far I don't see how it can be done using a CEP based approach (using rule based (EPL) CEP principles). One can create a rule that observes messages (aka events); however, I don't see a way to derive relationships that can later be used to classify incoming messages or events.

It seems to me that CEP would require a relationship engine of some sort that can be used to derive, store and query relationships that can be used by the CEP engine when processing events.

For example: say we observe events A, B and C. There may be a relationship between these events. We can say events A and B are related (written A->B) if A's and B's payloads contain a certain key (for example an order number or customer id). Let's call E(x) event E with payload x. If we observe A(x), B(x) and C(x), we can derive that A->B and B->C. If the relation is transitive we can derive that A->C as well.

So it would be helpful to have a relationship service within CEP engines where one can declare a relationship and then at runtime determine whether events A and B are related and, if they are, what types of relations qualify.
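To show what I mean, here is a minimal sketch of such a relationship service. It is not taken from any CEP product: RelationshipService, observe and related are hypothetical names, the only relation supported is "shares a payload key", and the relation is assumed to be transitive, which is implemented with a simple union-find structure.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical relationship service: events that share a payload key are
// related, and the relation is treated as transitive (union-find).
class RelationshipService {
    private final Map<String, String> parent = new HashMap<>();           // event id -> parent event id
    private final Map<String, String> firstEventForKey = new HashMap<>(); // payload key -> first event seen with it

    // Register an event together with the correlation keys found in its payload.
    void observe(String eventId, String... payloadKeys) {
        parent.putIfAbsent(eventId, eventId);
        for (String key : payloadKeys) {
            String earlier = firstEventForKey.putIfAbsent(key, eventId);
            if (earlier != null) {
                union(earlier, eventId); // same key seen before: relate the two events
            }
        }
    }

    // True when both events are known and belong to the same related group.
    boolean related(String a, String b) {
        return parent.containsKey(a) && parent.containsKey(b) && find(a).equals(find(b));
    }

    private String find(String id) {
        String p = parent.get(id);
        if (p.equals(id)) {
            return id;
        }
        String root = find(p);
        parent.put(id, root); // path compression
        return root;
    }

    private void union(String a, String b) {
        parent.put(find(a), find(b));
    }
}
```

Observing A with key order=1, B with keys order=1 and customer=7, and C with key customer=7 makes related("A", "C") true even though A and C share no key directly; an event D carrying only order=2 stays unrelated.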

What are the key performance attributes of CEP engines

2007-10-09T16:33:00.000-07:00

CEP engines are typical implementations of a classic producer and consumer paradigm and therefore can be measured in their ability to produce and consume events. So what would be some of the metrics that we can use:

Rate of complex rules per second -- number of rules that can be processed per second
Rate of instructions per second -- since each complex rule may consist of more primitive instructions, knowing the rate of instruction execution per second may be useful.
Publishing rate per second -- peak rate at which events can be published to the engine.
Consumption rate per second -- peak rate at which events can be consumed by event listeners, a.k.a. sinks.
Event processing latency (ms) -- time it takes for an event to be processed after it is published.
Event delivery latency (ms) -- time it takes to deliver an event after it is processed by the event processor or chain of event processors.
Outstanding event queue size -- number of events that are waiting to be processed. An important measure that tells the user how many events are in the queue to be processed.

The sum of the processing and delivery latency produces the total latency to be expected by the end user. This latency can then be compared to the required quality of service or SLA for given process to determine if the processes can yield useful results as specified by the SLA.

What interests me is not only the metrics, but also the behavior in situations when the rate of production exceeds the rate of consumption for a significant period of time. In this case, the influx of incoming events would have to be buffered somewhere to be processed by the engine. This of course cannot go on without significant performance degradation as well as an increase in the overall processing latency.

There are several strategies that can be used separately or in combination:

Buffering -- the simplest technique, where events are buffered for both consumers and producers to accommodate peaks. Eventually the buffers get exhausted and production and consumption rates must equalize, either by reducing the rate of production or by increasing the rate of consumption.
Increasing the number of consumers -- this can drive the consumption rate up. However this technique suffers from the plateau effect -- meaning that after a certain number the rate of consumption stalls and starts to decrease.
Dynamic throttle -- this is where the rates of production and consumption are throttled. The easiest place to throttle is at the event production phase, where events are actually dropped or event production is decreased via deliberate and controlled action. In this situation the latency is passed on to event producers.
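The first and third strategies can be combined in a small sketch: a bounded buffer absorbs peaks, and once it is full new events are dropped at the producer and counted. ThrottledEventBuffer is a made-up name, and dropping is just one possible throttle policy; blocking the producer would be the alternative that passes latency back to it.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical bounded event buffer: absorbs peaks up to its capacity and
// throttles by dropping events once the buffer is exhausted.
class ThrottledEventBuffer<E> {
    private final BlockingQueue<E> queue;
    private final AtomicLong dropped = new AtomicLong();

    ThrottledEventBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Producer side: never blocks; an event that does not fit is dropped and counted.
    boolean publish(E event) {
        boolean accepted = queue.offer(event);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    // Consumer side: returns the oldest buffered event, or null when the buffer is empty.
    E poll() {
        return queue.poll();
    }

    // Outstanding event queue size: events waiting to be processed.
    int outstanding() {
        return queue.size();
    }

    long dropped() {
        return dropped.get();
    }
}
```

With a capacity of 2, a burst of three events leaves two outstanding and one dropped; as soon as a consumer polls one event the producer can publish again.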

The Great Battle of the Networks

2007-09-07T08:40:00.000-07:00

"The Great Battle of the Networks" is what is happening in today's IT environment. Businesses that have the best, most efficient networks, and better intercommunication and integration among their various subsystems, will prevail. Today's IT networks can be compared to biological nervous systems.

So what is happening today: organizations are building more complex, more efficient networks. Technologies such as virtualization, application integration, grid computing, and network and performance monitoring are making these networks faster and more agile.

While businesses are investing in their IT infrastructure to improve their business, cut costs and maintain competitiveness, I am wondering if there is a more hidden by-product of this growth -- a steady evolution of the intelligent network -- a web of self-organizing, self-healing, agile networks.

As with biological organisms, growth in neural complexity led to the evolution of intelligence. While every neuron is a fairly simple cell, their collections produce astounding results -- a higher order of function.

The world wide web already exhibits some features of intelligent self-organizing systems. You may say that "we the humans" are the ones organizing it. In my view the "who or what" does not matter; what matters is the final outcome.

Just like bio-organisms, networks allow businesses to adapt to an ever changing environment. We may be looking at the early rise of the "intelligent net". What might be interesting is that such intelligence may be beyond our senses, just like each and every neuron is unaware of the higher organization it is part of.

Virtualizing Monitoring Infrastructure: Virtual CEP

2007-09-01T06:08:00.000-07:00

Virtualization offers clear advantages when it comes to storage, server and desktop virtualization. Today we can run Mac OS, Windows, Linux and other operating systems on a single piece of hardware, all at the same time, with seamless integration and switching from one to another. Servers and storage can be consolidated easily, reducing energy use, the costs associated with new hardware and storage, and management overhead. The benefits are clear, especially for those managing complex data centers.

Virtualization is an interesting concept when applied in the area of application performance monitoring, Business Activity Monitoring and the like. When we apply it to CEP (Complex Event Processing), it would be nice to achieve the following:

Linear scalability with increased loads -- meaning it takes the same effort to go from 1 million rules/sec to 2 as 10 million to 20 million
Installation, deployment and reconfiguration within minutes
Unlimited processing capacity (only limited to the physical boundaries of the servers) -- meaning the number and rate of events that can be processed per unit of time.

Virtualizing CEP capabilities delivers these benefits through a Virtual CEP Environment (VCE). A VCE is a virtual collection of individual CEP engines pooled together as a single virtual processing environment. In this model the processing capacity of each instance is added to the overall VCE processing capacity.
A VCE can be implemented on top of virtual machines such as VMware, Xen or Parallels -- meaning virtual machines on a single box or on separate hardware boxes can be pooled together to deliver processing capability.

The diagram above depicts the VCE concept, where 3 physical servers are aggregated into a single virtual CEP capable of processing 5.5 million rules/sec. It is easy to add more capacity by simply instantiating CEP instances either on an existing box/VM or on additional hardware. Instances can also be taken offline with little or no disruption.

Achieving Proactive Business Activity Monitoring (BAM) solution by combining Dashboard with CEP

2007-08-03T11:03:00.001-07:00

Some people think of BAM as a dashboard that displays Key Business Performance Indicators (KBPIs). Mostly, these indicators are obtained from underlying technologies, databases and business applications and presented on the dashboard using nice looking graphs and gauges. While this typical view is what some BAM vendors provide, it falls far short of the actual business requirements:

Proactive correlation of complex interactions of key business systems
Detection of negative trends and prevention
Notification and proactive actions based on conditions and trends
Measuring impact of IT on key business services and the bottom line

So the dashboards provide the visualization part of the total BAM solution. The big question is, first, how do you measure KBPIs reliably, and second, how do you slice the IT environment and measure its impact on those KBPIs? The latter is a daunting task for most IT organizations -- trying to figure out how to obtain, measure and correlate thousands of different pieces of information to create a coherent picture of the impact on the business.

This is where I believe a solid CEP (Complex Event Processing) system might come to the rescue. Coupled with a solid data collection mechanism, CEP can deliver what dashboards need in order to deliver a complete "Proactive BAM" solution.

So the case can be structured as follows: use simple data collectors to obtain the critical data required to make decisions (this could be tied to middleware, app servers, applications, business apps, existing tools), correlate the information in a CEP-like engine to create KBPIs, and integrate CEP/KBPIs with your favorite dashboard.

This process of course will only be successful if both the business and IT sides are on the same page -- meaning business requirements are well defined, and the process is well managed.

Therefore I believe the key to any BAM implementation is to combine dashboard technologies with CEP capabilities. The union of the two provides not just visualization, but causality, proactivity and drill down to the elements that impact your bottom line.

Java Runtime.exec() pitfalls re-examined

2007-08-03T10:17:00.000-07:00

While there is plenty of stuff written about proper usage of Java Runtime.exec(), there still seems to be a problem with the way most developers use it. First, the streams provided via the Process object must be drained to prevent a process hang or even deadlock. Second, upon process termination the process streams must be closed (OutputStream, InputStream and ErrorStream). However, even this clean up may not prevent your JVM from running out of file descriptors.

Apparently, as tested with JRE 1.4.2_13 (Linux and AIX), the JVM leaves open handles dangling upon process termination even if all streams are explicitly closed. Interestingly, these handles are cleaned up when System.gc() is called explicitly -- so it cannot be a leak in the code.

As a result, repeated executions of Runtime.exec() may cause descriptor exhaustion and subsequent failures when opening sockets, files, and launching programs from within your JVM. The common error is "Too many open files".

After lots of tests, I found that calling Process.destroy() after the process ends solves the handle leak.
However, you must be very careful when calling this method: if you still have running threads that are reading or writing the Process streams, destroy() would make them fail on the next IO. So thread synchronization must be put in place to make sure that destroy() is called only after all stream threads have terminated.
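Putting those rules together, a minimal sketch could look like the following. It is written for current Java (lambdas, try-with-resources) rather than JRE 1.4.2, ExecRunner and Result are names made up for the example, and I have not verified whether the destroy() call is still needed on modern JVMs; it is kept here only to mirror the workaround described above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: drain both output streams on their own threads, wait
// for the process, join the drainers, and only then call destroy().
class ExecRunner {
    static final class Result {
        final int exitCode;
        final String stdout;
        final String stderr;

        Result(int exitCode, String stdout, String stderr) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
        }
    }

    static Result run(String... command) throws IOException, InterruptedException {
        Process process = Runtime.getRuntime().exec(command);
        try {
            process.getOutputStream().close(); // nothing to send: child sees EOF on stdin
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ByteArrayOutputStream err = new ByteArrayOutputStream();
            Thread outDrainer = drain(process.getInputStream(), out);
            Thread errDrainer = drain(process.getErrorStream(), err);
            int exitCode = process.waitFor();
            outDrainer.join(); // both drainers must finish before destroy()
            errDrainer.join();
            return new Result(exitCode, out.toString(), err.toString());
        } finally {
            // On the normal path the drainers have already terminated here.
            process.destroy();
        }
    }

    // Starts a thread that copies everything from 'in' to 'sink', then closes 'in'.
    private static Thread drain(InputStream in, ByteArrayOutputStream sink) {
        Thread thread = new Thread(() -> {
            byte[] buffer = new byte[4096];
            try (InputStream stream = in) {
                int n;
                while ((n = stream.read(buffer)) != -1) {
                    sink.write(buffer, 0, n);
                }
            } catch (IOException ignored) {
                // Stream was closed underneath us; keep whatever was read so far.
            }
        });
        thread.start();
        return thread;
    }
}
```

On a system with a POSIX shell, ExecRunner.run("sh", "-c", "echo hello") returns exit code 0 with the command's output captured, and the two join() calls are the synchronization that keeps destroy() from cutting off a reader mid-stream.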

I am not sure if this problem exists on all platforms and JVMs, but the lesson is that Runtime.exec() is a lot more complicated than it seems and requires careful handling within your code.

Beyond SOA -- Practice Oriented Architecture (POA)

2007-06-29T14:49:00.001-07:00

I just came up with a new acronym -- POA (Practice Oriented Architecture). What is it? Well, looking at SOA (which revolves around services and service orientation), it begs the next question: what is beyond SOA? I've been thinking about it for some time.

SOA promises composite applications that closely model the way organizations do business -- that is, SOA promises reusable business services. That is great.

In my view POA (or some variant of the acronym) will take hold soon after. What are the key elements of POA? Composition of business services into "Practices" -- goal oriented composite business processes. POA relies on a combination of proven business services that together form Practices. A Practice is a polished set of business services that have been perfected or almost "evolved" as a result of various business activities. POA also focuses on goal orientation rather than composition, meaning that the architecture will promote the evolution of practices from so-so to good to better to best. An obvious question would be "how do you measure good, better, best?" I don't have an answer, but I am working on it :) -- maybe by revenue, customer loyalty, or some other key business metric.

You might say, "How is it different from BPM?" Well, in my view BPM still falls into the SOA space and does not address the questions of how composite business services form into "Practices" and how you consistently improve and integrate composite business practices. A Practice to me is a discipline that lets companies excel at what they do. SOA will let them integrate; POA will let them take it to the next level.

The new discipline will have to focus on how to create, improve and deploy best of breed goal oriented processes that can almost self evolve.

Or maybe I just like new acronyms. :)

SOA Governance equals Web Service Management?

2007-06-20T18:57:00.000-07:00

Today I read an article from ZapThink -- "Divorcing SOA from Web Services", where the author is basically saying that many people in the industry incorrectly equate SOA with Web Services. I have come across similar situations. Every time there is talk about SOA management it is automatically associated with Web Service management. I guess most organizations are implementing SOA using Web Services. Come to think of it, if an organization is trying to implement SOA, what technologies would they use? Are Web Services the only viable way of creating service orientation? Well, surely CORBA is a candidate, but is anyone actually doing this? A big part of SOA is not just service orientation but also service interoperability.

Did Web Services become a de facto standard for developing service oriented composite applications? Clearly Web Services are not enough to achieve a true SOA implementation; that is where ESBs (Enterprise Service Buses) come into the picture.

The article also introduces the term SOI (Service Oriented Infrastructure), which is basically the building blocks of the SOA paradigm such as a message bus, ESB etc. So organizations must effectively translate SOA into a specific SOI as well as come up with best practices to support it. So the question is, when do you truly achieve service orientation? I guess that would be organization specific. If SOA is about business services, a SOA implementation would mean composing/creating reusable business services that can be integrated and reused on demand. The granularity of the service interfaces would be organization and business centric.

So if Web Services follow the "bind-publish-use" paradigm, SOA would be "bind-publish-compose-use", where each composition of services is in itself a business service that can be used as an independent service.
That makes sense.

ITIL and SOA -- merging best practices

2007-06-06T17:12:00.000-07:00

There is a lot of traction around ITIL (Information Technology Infrastructure Library) and SOA (Service Oriented Architecture). While SOA provides best practices around technology integration, ITIL is focused on best practices around IT Service Management (ITSM). Effective implementation of both strategies is supposed to yield a capable IT infrastructure on one end and tight IT governance on the other. Well, that is true if concepts and practices can be translated into implementation easily -- not quite the case yet.

ITIL, while providing a solid base, is in practice difficult to achieve on an end-to-end basis. So organizations pick tools that fit into the ITIL model -- yet may not readily integrate with one another. The result is organizations that on paper are ITIL compliant yet in reality cannot fully realize the ITIL vision.

So how should organizations deal with strategic long term direction and yet deal with the short term practical issues of SOA governance (using ITSM)?

Organizations need to adapt ITIL practices to fit their size and requirements (current and future).
Technology acquisition and adoption must always be evaluated against today's needs and must always be in line with long term strategic goals.

While it sounds quite obvious, it is not easy to enforce, especially in large organizations. What is happening is that long term strategy is often traded for short term benefit, often on a project by project basis. This in turn creates a technology and tool mix that is 1) complex and 2) costly to maintain going forward.

Some organizations set up architecture committees or review boards, which review and approve all technology acquisitions and adoption within the organization. They are tasked with ensuring that best practices are enforced on one end, and with reviewing and amending those same practices on the other.

The role of such review boards is essential, especially when organizations are trying to implement broad practices such as ITIL and/or SOA.

SOA - yet another layer of complexity

2007-05-29T16:15:00.000-07:00

In my view the introduction of SOA will push IT complexity beyond a point that can be effectively managed by today's management infrastructure. Why do I say that? Several reasons:

SOA based applications will require management tools that adapt to changing requirements much faster than can be accomplished using today's management tools.
SOA based applications will be able to connect a vast array of disintegrated processes together, creating a level of interconnectedness that can no longer be handled by event based health monitors.
The number of tools and technologies required just to manage a SOA environment in itself creates a management and complexity overhead.
Complexity simply will grow exponentially if not managed properly or methodically reduced and simplified.

So SOA infrastructure has the potential to create a complex environment that is out of control and much more difficult to manage than typical 3-tier, client/server or monolithic applications. So how do we deal with the benefits on one end and growing complexity on the other? Well, I believe there is only one way to do this -- investment in SOA governance tools and practices. This means not only technologies, but also skills, processes and practices. There are several methods that look very attractive:

Virtualization -- one way to deal with complexity is to try to simplify the environment -- do more with less -- less hardware, less network, less software. Less is very often more.
Create and Maintain a Capability Matrix -- rather than a tools matrix -- and map the tools you have to capabilities. Obviously you need to have all critical capabilities fulfilled by at least one technology or tool. Duplicates can be eliminated. Think in terms of technology capabilities rather than tools and technologies. Use this matrix to simplify your environment.
Use the KISS approach -- I have liked this approach since my college years. Keep It Simple, Stupid. Apply it in every possible aspect. Complicated setups are hard to maintain and manage.
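
The capability matrix idea can be sketched in a few lines of Java. This is only an illustration under assumptions: the class name CapabilityMatrix, its methods, and the tool and capability names (ToolA, ToolB, "messaging" and so on) are hypothetical and do not come from any product. It maps tools to capabilities and then reports capabilities with no tool (gaps) and capabilities covered by more than one tool (candidates for elimination).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch of a capability matrix: capabilities first, tools second.
public class CapabilityMatrix {
    // capability -> tools that provide it (sorted for stable output)
    private final Map<String, Set<String>> toolsByCapability = new TreeMap<>();

    // Register every capability the environment must have, initially with no tool.
    public CapabilityMatrix(List<String> requiredCapabilities) {
        for (String capability : requiredCapabilities) {
            toolsByCapability.put(capability, new TreeSet<>());
        }
    }

    // Record that a tool provides the given capabilities.
    public void map(String tool, String... capabilities) {
        for (String capability : capabilities) {
            toolsByCapability.computeIfAbsent(capability, k -> new TreeSet<>()).add(tool);
        }
    }

    // Capabilities that no tool covers.
    public List<String> gaps() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : toolsByCapability.entrySet()) {
            if (e.getValue().isEmpty()) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // Capabilities covered by more than one tool: candidates for consolidation.
    public Map<String, Set<String>> duplicates() {
        Map<String, Set<String>> result = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : toolsByCapability.entrySet()) {
            if (e.getValue().size() > 1) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        CapabilityMatrix matrix = new CapabilityMatrix(
                List.of("messaging", "service registry", "monitoring"));
        matrix.map("ToolA", "messaging", "monitoring"); // made-up tool names
        matrix.map("ToolB", "monitoring");
        System.out.println("Gaps: " + matrix.gaps());
        System.out.println("Duplicates: " + matrix.duplicates());
    }
}
```

With the made-up data in main, "service registry" shows up as a gap and "monitoring" as a duplicate served by both tools, which is exactly the kind of overlap the matrix is meant to expose.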

If left unchecked, the benefits of SOA (and there are many) will be eroded by increased maintenance, degradation of quality of service, and loss of revenue and customer loyalty. So implementing a successful SOA environment is about well-balanced growth -- focusing not just on adding capabilities but also on how to reduce the complexity.

So what is Enterprise Service Bus?

2007-05-25T15:20:00.000-07:00

It seems like there is some confusion around the meaning of Enterprise Service Bus (ESB). To some, ESB is a specific product offering from vendors such as IBM, Oracle, BEA, etc. To others, ESB is a set of technologies such as messaging, application servers, web services and data services integrated together -- basically a technology fabric that lets users unlock the value of SOA. I tend to agree with the latter definition.

ESB is a specific implementation of SOA and would have to be a combination of various technologies integrated using SOA paradigms and standards.

Once we have the definition right, we can start asking ourselves questions: "What does it mean to manage an ESB?" In my view, based on the latter definition of the term ESB, it means that all components of the ESB must be managed in a single uniform fashion -- not just an individual product offering. "How do we measure the quality of service provided by an ESB?" This is a much more complex question to answer, not just due to the sheer complexity of the technologies involved but also due to the different perceptions/definitions of what QOS actually means for different users and stakeholders of the SOA/ESB environment.
At any rate, proper definition of ESB is essential before more complex questions can be asked and answered.

Essential building blocks of a sound SOA environment

2007-05-18T09:13:00.000-07:00

Many SOA implementations are evolutionary projects, usually starting with application integration and then moving up the food chain -- web services, ESB, business service and workflow management. Few shops have actually implemented all layers of SOA, but work is under way. Below are the basic building blocks for a successful SOA implementation:

Service Registry (UDDI)
Common Message Bus (JMS, MQ, others)
Common Application Service Platform (Application Servers)
Common Service Orchestration and collaboration engine (Enterprise Service Bus)
Data transformation and integration services (Message brokers)
Common Management and Monitoring Services

Obviously these technology pieces must be complemented by the proper organization and skill set to turn them into an effective enterprise integration fabric. So SOA is not just about technology but also about the people, processes and best practices that make it successful. And that is why SOA cannot be bought or acquired like other products. It must be built, nurtured and refined every step of the way. There is a term that sums it all up -- SOA governance.

Improving performance of Java applications

2007-05-18T08:14:00.001-07:00

During many implementations of high-performance Java applications, I found that the following strategies lead to better performing Java apps -- especially better performing server apps.

Limit the number of object instantiations. This might actually be difficult to do, but with the right design and approach tremendous savings are possible. Object pools are one way to go, but may not be applicable in every situation.
Avoid or improve object serialization if possible. Object serialization (across networks) happens to be very slow. It offers ease of use, but you pay in performance. Analyze the classes that are serialized and make sure only required attributes are serialized. Hashtables and complex structures are the biggest hit.
Analyze and optimize string-related operations -- concatenations and string buffer extensions in particular are usually expensive.
More threads does not mean better throughput -- thread management adds overhead, and beyond a certain number more threads may actually degrade performance. Multiple CPUs might add only marginal improvement, especially when threads are dependent on and synchronized with other threads. Revise your thread model to take advantage of multiple CPUs.
Avoid synchronous IO -- asynchronous socket IO and file IO are preferred; let the worker threads do the job. I happen to like the principle of accepting the client request, letting a worker thread process it and unblocking the client app -- then notifying the client when the task completes.
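
Here is a minimal sketch of the last two points plus the string advice, written against the standard java.util.concurrent API of current Java versions rather than anything from the original projects. The class name WorkerHandoff, the process method and the "order-42" request are hypothetical names chosen for illustration. The request is accepted, handed to a bounded pool of worker threads sized to the CPU count, the caller is unblocked right away, and it is notified through a callback when the task completes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: accept a request, process it on a worker, notify on completion.
public class WorkerHandoff {
    // Bounded pool sized to the CPU count; adding more threads than this
    // mostly adds scheduling overhead for CPU-bound work.
    private final ExecutorService workers =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    // Accept the request and return immediately; the future completes
    // once a worker thread has finished processing.
    public CompletableFuture<String> submit(String request) {
        return CompletableFuture.supplyAsync(() -> process(request), workers);
    }

    // Placeholder for real work. The reply is built with a single pre-sized
    // StringBuilder instead of repeated string concatenation.
    static String process(String request) {
        StringBuilder reply = new StringBuilder(request.length() + 16);
        reply.append("processed[").append(request).append(']');
        return reply.toString();
    }

    // Stop accepting new work and let the worker threads exit.
    public void shutdown() {
        workers.shutdown();
    }

    public static void main(String[] args) {
        WorkerHandoff server = new WorkerHandoff();
        // The callback plays the role of the client notification.
        CompletableFuture<Void> done = server.submit("order-42")
                .thenAccept(reply -> System.out.println("notified: " + reply));
        System.out.println("client is not blocked while the worker runs");
        done.join(); // only so this demo waits before shutting down
        server.shutdown();
    }
}
```

The order of the two printed lines can vary from run to run, since the worker and the main thread proceed independently. Sizing the pool to the CPU count is an assumption that suits CPU-bound tasks; workloads that block on IO may need a different size or truly asynchronous IO.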

Is SOA all that?

2007-05-17T19:32:00.000-07:00

Service Oriented Architecture seems to be the new buzzword everywhere!!! SOA will solve all the problems, everyone wants SOA -- or so I hear. Anyway, I remember similar claims when CORBA was coming to town. It was supposed to change the way IT does business. To me, SOA is XML/SOAP-based CORBA with evolving standards around it... I might be wrong.

Yet here we are and very little has changed. Well, maybe not; one thing is for sure -- IT complexity is growing at an alarming rate and I am wondering how much complexity we can take until we lose control. IT shops take this concern seriously and spend big bucks trying to manage this complexity. I have a question -- will the introduction of SOA help IT better manage complexity and reduce cost, or will it fail just like its predecessor? So far I see lots of traction in the field, but also a lot of skepticism.

What I do know for sure is that organizations have to find new ways of leveraging technology and turning it into a competitive advantage. SOA is an experiment. I like experiments; they keep us busy and in business.