Friday, November 28, 2008

2008-11-28 Friday - Linux based SOA Demo Platform

Tonight I downloaded VMware Player 2.5.1 - which is an update over the previous version 2.0 that I had installed.

I have also downloaded the Fedora 10 release of Linux. (Release Notes)

Tonight I'm starting the process of creating a VMware image that will be an integrated demonstration platform for a number of Open Source SOA tools - here are a few of the things I plan to configure as the base:


  • Glassfish ESB

  • Apache Synapse

  • JBoss Drools

  • Mule Galaxy

  • MySQL

  • memcached



I'm experimenting with the Microsoft Virtual PC 2007 SP1 virtual machine software - since VMware Player does not allow for the creation of a new virtual machine.

2008-11-29 Update

Fedora 10 Inside Windows: Screenshots Tour

EasyVMX!: Create Virtual Machine
Super Simple Edition

2008-11-28 Friday - XML Schema Versioning

I came across the following document today:


Evaluating a Service-Oriented Architecture
Phil Bianco, Software Engineering Institute
Rick Kotermanski, Summa Technologies
Paulo Merson, Software Engineering Institute
September 2007
TECHNICAL REPORT
CMU/SEI-2007-TR-015
ESC-TR-2007-015
Software Architecture Technology Initiative
http://www.sei.cmu.edu/pub/documents/07.reports/07tr015.pdf

See page-44

5.12 WHAT IS THE APPROACH FOR SERVICE VERSIONING?



...which points to an external document in the References section:



[xFront 2007]

xFront. XML Schema Versioning.

http://www.xfront.com/Versioning.pdf


...which is a 5-page pdf discussion of this topic.


A closely related topic is the issue of WSDL versioning - which is covered quite well in an Iona Best Practices white paper: http://blogs.iona.com/sos/20070410-WSDL-Versioning-Best-Practise.pdf

Thursday, November 27, 2008

2008-11-27 Thursday - Writer Tools

I am starting to do some writing for an online magazine (Developer.com) - and found the need to do some screen captures for the first article that I'm writing.

Windows Screen Capture Frustration

After trial-and-error and more than a wee bit of frustration at the limitations of using the built-in capabilities of Windows to capture screen images (e.g. hit the ALT-Print-Screen button, and then copy from the clipboard into Microsoft Paint) - I decided to look for a better tool more suited to the task.

Why can't it just be easy?


A Craftsman Uses the Proper Tool

Many, many years ago I used a tool called SnagIt - and remember how well suited it was to the task. A quick Google search landed me on the TechSmith web site - where I was able to download a 30-day evaluation copy of SnagIt 9.0.2.

Within about 10 minutes I was happily capturing images, doing various transformations of the image, and saving them in a variety of popular formats (and some not so popular).


Highly Recommended:



2008-11-27 Thursday - GIS Books

I've created a GIS category in my Amazon recommendations bookstore.

I've also created a short-list of GIS resource links.

Here are a few of the selections that look interesting or are Open Source related:





Desktop GIS: Mapping the Planet with Open Source Tools
by Gary Sherman


Desktop GIS explores the world of Open Source GIS software and provides a guide to navigate the many options available. Discover what kind of GIS user you are and lay the foundation to evaluate the options and decide what software is best for you.
Desktop GIS examines the challenges associated with assembling and using an OSGIS toolkit. You'll find strategies for choosing a platform, selecting the right tools, integration, managing change, and getting support. The survey of OSGIS desktop applications provides you with a quick introduction to the many packages available. You'll see examples of both GUI (Graphical User Interface) and command line interfaces to give you a feel for what is available.

This book will give you an understanding of the Open Source GIS landscape, along with a detailed look at the major desktop applications, including GRASS, Quantum GIS, uDig, spatial databases, GMT, and other command line tools. Finally, the book exposes you to scripting in the OSGIS world, using Python, shell, and other languages to visualize, digitize, and analyze your data.










GIS for Web Developers: Adding 'Where' to Your Web Applications
by Scott Davis


GIS for Web Developers introduces Geographic Information Systems (GIS) in simple terms and demonstrates hands-on uses. With this book, you'll explore popular websites like maps.google.com, see the technologies they use, and learn how to create your own. Written with the usual Pragmatic Bookshelf humor and real-world experience, GIS for Web Developers makes geographic programming concepts accessible to the common developer.

This book will demystify GIS and show you how to make GIS work for you. You'll learn the buzzwords and explore ways to geographically-enable your own applications. GIS is not a fundamentally difficult domain, but there is a barrier to entry because of the industry jargon. This book will show you how to "walk the walk" and "talk the talk" of a geographer.

You'll learn how to find the vast amounts of free geographic data that's out there and how to bring it all together. Although this data is free, it's scattered across the web on a variety of different sites, in a variety of incompatible formats. You'll see how to convert it among several popular formats including plain text, ESRI Shapefiles, and Geography Markup Language (GML).










The KML Handbook: Geographic Visualization for the Web
by Josie Wernecke
KML began as the file format for Google Earth, but it has evolved into a full-fledged international standard for describing any geographic content—the “HTML of geography.” It’s already supported by applications ranging from Microsoft Virtual Earth and NASA WorldWind to Photoshop and AutoCAD. You can do amazing things with KML, and this book will show you how, using practical examples drawn from today’s best online mapping applications.









Beginning MapServer: Open Source GIS Development
by Bill Kropla


Beginning MapServer: Open Source GIS Development...offers a comprehensive introduction to MapServer, the development platform for integrating mapping technology into Internet applications. You'll learn how to build and extend dynamic applications using popular languages like PHP, Perl, and Python.

After a thorough introduction to installation and configuration, you'll uncover basic MapServer topics and examples. You'll also learn about advanced MapServer features, and how to query and incorporate dynamic data into your application. The book culminates with the creation of an actual mapping application.










Open Source GIS: A GRASS GIS Approach
by Markus Neteler, Helena Mitasova


Thoroughly updated with material related to the GRASS6, the third edition includes new sections on attribute database management and SQL support, vector networks analysis, lidar data processing and new graphical user interfaces. All chapters were updated with numerous practical examples using the first release of a comprehensive, state-of-the-art geospatial data set.

Tuesday, November 25, 2008

2008-11-25 Tuesday - JBoss ISV Program

Press Release: Red Hat Expands JBoss Certified ISV Program With 250 Partners

For more information about the JBoss Certified ISV Program:
http://www.jboss.com/partners

2008-11-25 Tuesday - Book Review: Business Process Management with JBoss jBPM

Book Review: Business Process Management with JBoss jBPM
(full disclosure: I was provided a free copy to review)




If you are a manager and need a good book to help you sell the idea of BPM to upper-management - and explain the value (and the practical application) of BPM tooling to solving real-world problems, then this book is a very good entry-level text - with a concise and practical approach to walking you through the current best practices in this technology space.

Here's a quick chapter overview of what's covered in the book:

Chapter 1: Introduction
- BPM approach to software development

Chapter 2: Understanding the target process
- Setting up the project
- Analyzing the process

Chapter 3: Develop the process in JBoss jBPM
- jBPM architecture
- Installation
- jBPM concepts
- Building the example process

Chapter 4: The Prototype user interface
- Build the prototype
- Investigating the web console interface
- Adapt the web console

Chapter 5: Iterate the prototype
- Set up for the proof of concept
- Iterate the system

Chapter 6: Proof-of-concept to implementation
- Preparation for implementation
- Monitoring the process

Chapter 7: Ongoing process improvement
- Project assessment
- Process analysis and improvement
- Business process documentation
- Ideas for further development

On November 24, 2008, the Drools team posted a video presentation:

Whats new in Drools 5 video and Q&A session
http://blog.athico.com/2008/11/whats-new-in-drools-5-video-and-q.html

There are some very exciting new developments in JBoss Drools - and this book is still a good text for communicating the process and concepts of jBPM development.

Monday, November 24, 2008

2008-11-24 Monday

Esther Schindler (Senior Online Editor at CXO Media / CIO Magazine) posted a question on LinkedIn a few months ago that I just came across: What are the burning issues in SOA (Service Oriented Architecture)?

There were 21 responses...some of which were quite interesting.

Sunday, November 23, 2008

2008-11-23 Sunday

soapUI 2.5, The REST Release is out

NetBeans IDE 6.5 Available for Download

Zviki Cohen's "Either You Succeed or Explain" blog
Eclipse 3.4 Hidden Treasures

Five ways for tracing Java execution

2008-11-21 Friday - QCon 2008 Afternoon

Disclaimer: There may be many mistakes/errors in my blog notes for QCon over the next few days - as I am writing these posts while I'm sitting in sessions - and many of these notes are stream-of-consciousness - as I try to keep up with the presenters.

I'm in San Francisco attending QCon 2008 this week.




Bringing the enterprise to the web with Mule
Dan Diephouse, Architect, MuleSource


Research: SideNote.net (?)

No prescribed message format

zero code intrusion
- mule does not impose an API on service objects
- objects are fully portable

Existing objects can be managed

Easy to test

1) Declare a service
2) Define inbound router
3) Define Outbound router

Used to connect components and external systems together

Endpoints use a URI for addressing

Inbound Routers
- Idempotency
- selective consumers
- re-sequencing
- message aggregation

Outbound Routers
- message splitting
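
A rough sketch of two of the router ideas above, in plain Python rather than Mule configuration. The class and function names are made up for illustration and are not Mule APIs: an idempotent inbound filter drops any message whose ID it has already seen, and a splitter turns one message into several.

```python
from dataclasses import dataclass


@dataclass
class Message:
    msg_id: str
    payload: list


class IdempotentReceiver:
    """Hypothetical inbound filter: accept each message ID only once."""

    def __init__(self):
        self._seen = set()

    def accept(self, message):
        if message.msg_id in self._seen:
            return False  # duplicate delivery, ignore it
        self._seen.add(message.msg_id)
        return True


def split(message):
    """Hypothetical outbound splitter: one message per payload item."""
    return [
        Message(f"{message.msg_id}-{i}", [item])
        for i, item in enumerate(message.payload)
    ]


receiver = IdempotentReceiver()
order = Message("order-1", ["book", "pen"])
first = receiver.accept(order)
second = receiver.accept(order)  # same ID delivered again
parts = split(order)
```

Here `first` is accepted and `second` is rejected, while `parts` holds one message per item. A real deployment would presumably need the seen-ID set to be persistent or to expire, which this sketch ignores.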


Core Concepts: Transformers
- converts data from one format to another
- can be chained together to form transformation pipelines
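
To make the chaining idea concrete for myself, here is a minimal sketch in Python (not Mule's transformer API). The three transformer functions and the sample payload are assumptions for illustration; the point is only that each step converts one format to the next and the chain behaves like a single transformer.

```python
from functools import reduce
import json


def chain(*transformers):
    """Compose transformers left to right into one pipeline."""
    def pipeline(data):
        return reduce(lambda value, transform: transform(value), transformers, data)
    return pipeline


# Three hypothetical transformers, each converting one format to another.
def bytes_to_text(raw):
    return raw.decode("utf-8")


def text_to_dict(text):
    return json.loads(text)


def dict_to_summary(record):
    return f"{record['customer']} ordered {record['quantity']} item(s)"


to_summary = chain(bytes_to_text, text_to_dict, dict_to_summary)
summary = to_summary(b'{"customer": "acme", "quantity": 3}')
```

Because every transformer takes one value and returns one value, they can be reordered or reused in other pipelines without knowing about each other.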

Building Services
- Annotations to expose your classes as RESTful service
- implements JAX ...?


Jersey - JAX-RS implementation
- simple annotations to create a RESTful web service


Integrating into your Messaging Layer

You can modify messages while filtering

ATOMPUB
- service
- workspace
-

ABDERA (collection adapters)
- ???
- JCR
- JDBC
- FileSystem

jcrAdapter allows instant creation of an Atom store
allows posting to the AtomPub log


Creating Atom Transformers

*** Research RSS Bandit ***

Chris Barry - research Atom Server usage
- used for vacation reservations
- used by Google?

Polling vs. Messaging

HTTP has a polling connector built in - that supports ETags (*** need to research ETags ***)
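
A note to self on ETags: the server returns an ETag header identifying the current version of a resource, the client sends it back in If-None-Match on the next poll, and the server answers 304 Not Modified (with no body) when nothing has changed. The sketch below simulates that exchange in-process. The FeedServer and PollingClient classes are hypothetical stand-ins, not Mule's connector or a real HTTP library, and hashing the body is just one possible way to derive an ETag.

```python
import hashlib


class FeedServer:
    """Stand-in for an HTTP server; not a real network endpoint."""

    def __init__(self, body):
        self.body = body

    def get(self, if_none_match=None):
        etag = hashlib.sha256(self.body.encode("utf-8")).hexdigest()
        if if_none_match == etag:
            return 304, etag, None  # Not Modified: no body is sent
        return 200, etag, self.body


class PollingClient:
    def __init__(self, server):
        self.server = server
        self.etag = None
        self.body = None

    def poll(self):
        status, etag, body = self.server.get(if_none_match=self.etag)
        if status == 200:
            self.etag, self.body = etag, body
        return status


server = FeedServer("entry 1")
client = PollingClient(server)
statuses = [client.poll(), client.poll()]
server.body = "entry 1, entry 2"
statuses.append(client.poll())
```

The second poll costs only a status line and headers, which is what makes polling tolerable when the resource changes rarely.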

Mule Books
--------------------
Mule In Action
Open ESB
???


There is an open debate in the REST community about what might be the best practice for communicating
the schemas for RESTful services.

WADL - Marc Hadley, WSDL for REST

Return the schema if the request is only a HEAD

Galaxy supports a feature called Netboot - which can facilitate deploying / synchronizing config files / components (?) - and JMX can be used to communicate an event to the servers that need to receive the update.








Ning
Jay Parikh, SVP of Engineering & Operations

AskANinja.com
TwitterMoms
Hoffspace
TheListProject.org
DIY Drones (robotics)
PickensPlan

Constraints:
- each network is different
- applications from scratch
- features built as platform APIs
- deny abuse of APIs
- support backwards compatibility
- scale BOTH individual applications and the platforms as a whole

Original Tools Used
===================================
Nagios
Splunk
Zenoss
Cacti
Keynote
EM
DDFM
InformMC
Cohesion
JMX
Segue
Loads of log files
Network gear (F5, Cisco,

Reboot
----------------------------------
Reduce MTTD and MTTR
more flexible monitoring
higher resolution
consolidate
self-service
unified infrastructure


Real-time Monitoring Overview


Visibility Challenges
---------------------------------
Real-time monitoring overview
- Erosebo (?) - custom built?
- Monitoring agent
- Archive to Hadoop
- Dashboard / Collector, aggregated event stream

2008-11-21 Friday - QCon 2008 Morning

Disclaimer: There may be many mistakes/errors in my blog notes for QCon over the next few days - as I am writing these posts while I'm sitting in sessions - and many of these notes are stream-of-consciousness - as I try to keep up with the presenters.

I'm in San Francisco attending QCon 2008 this week.




9AM-
Social Architectures

MySpace presentation
Dan Farino, Chief Systems Architect, MySpace.com



__The traditional distribution channels are being redefined...__
SMB Channels are outpacing large channels

Transacting within the community being redefined

Small producers can now compete with the large industries


__Social Fuzziness__
- the boundaries of the social sites are not as clear as the preceding slides suggest
-- myspace and facebook transact
-- ebay has a thriving community
-- digg and ing are about networking


__Social Architecture Challenges__
- worry not that no one knows of you - seek to be worth knowing -- Confucius

Shared joy is a double joy - shared sorrow is half a sorrow - swedish proverb

connections between people are transitive and lack affinity

partitioning for scale is always problematic


Tools are sparse for large-scale windows server management

Traditional Plan:
- Plan
- Implement
- Test
- Go Live
- Manage growth


MySpace plan as executed
- implement
- go live

result:
- reboot servers - often
- "Shotgun debugging"
-- a process of making relatively undirected changes to software in the hope that a bug will be perturbed out of existence.
-- need to resolve the problem now and collecting data for analysis would take too long


Windows 2000 Server
IIS

Operationally - where they were...
- batch files and robocopy for code deployment
- "psexec" for remote admin script execution
- Windows Performance Monitor for monitoring

No formal, automated QA process



Current Architecture
- 4,500 web servers, Windows 2003 IIS 6.0 ASP.NET
- 1,200 "cache" servers - 64-bit Windows 2003 (key value pair, distributed)
- 500+ database servers


QA Today
- unit tests/automated testing
- don't fuzz the site nearly as thoroughly as the users do
- there are still problems that happen only in production

Ops Data Collection
Two types of systems:
- Static: collect, store, and alert based on pre-configured rules (e.g. Nagios)
- Dynamic: Write an ad-hoc script or application to collect data for an immediate or one-off need


(Very interesting: Windows Performance counter monitor display in their Ops Data Collection)

Cons of static system:
- relatively central configuration managed by a small number of administrators
- bad for one-off requests: change the config, apply, wait for data
- developer's questions usually go unanswered

Developers like to see their creations come to life

Cons of the dynamic system:
- it's not really a "system" at all - it's an administrator running a script
- is a privileged operation: scripts are powerful and can potentially make changes to the system
- even run as a limited user, bad scripts can still DoS the system
- one-shot data collection is possible but learning about deltas takes a lot more code (and polling, yuck)
- different custom data-collection tools that request the same data point cause duplicated network traffic

*** They use Powershell a lot ***

Ideally, all operational data available in the entire server farm should be queryable

New Operational Data-subscription platform for Ops Data Collection
- on-demand
- supports both "one-shot" and "persistent" modes
- can be used by non-privileged users
- a client makes __one__ TCP connection to a "Collector" server
-- can receive data related to thousands of servers via this one connection
-- like having all of the servers in a Jabber chat room and being able to talk to a selected subset of them at any time (over __one__ connection)



Agents provide...
- Windows Performance Counters
- WMI Objects
-- event logs
-- hardware data
-- custom WMI objects published from out-of-process
- Log file contents

On Linux, plans are to hook into something like D-Bus...

All C#, asynch I/O - never blocks a thread
Uses MS "Concurrency and Coordination Runtime"
Agent runs on each host
Wire protocol is Google's Protocol Buffers
Clients and Agents can be easily written to the Agents wanted to see if C#+CCR could handle the load (yes it can)

Why develop something new?
- there doesn't seem to be anything out there right now that fits the need
- requirements include free and open source

To do it properly - you really need to be using 100% async I/O
Libraries that make this easy are relatively new
- CCR
- Twisted
- GTask ==> Need to research this (for Linux asynch callback processing)
- Erlang

What does this enable?
- the individual interested in the data can gather it themselves.
- it's almost like exploring a database with some ad-hoc SQL queries
- "I wonder"...questions are easily answered without a lot of work
- charting/alerting/data-archiving systems no longer concern themselves with the data-collection intricacies
- we can spend time writing the valuable code instead of rewriting the same plumbing every time
- abstract physical server-farm from the user
- if you know machine names, great - but you can also say "all servers serving..."
- guaranteed to keep you up to date
- get your initial set of data and then just wait for the deltas
- pushes polling as close to the source as possible
- eliminates duplicate requests
- hundreds of clients can be monitoring the "% processor time"
- only collects data that someone is currently asking for
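
My own toy reconstruction of the subscription idea, to check that I understood it. This is an in-process Python sketch, not the MySpace implementation (which was described as C# with CCR and Protocol Buffers). The Collector class, the counter names, and the callback signature are all assumptions. It shows three of the claims above: only subscribed data points are read, each is read once no matter how many clients want it, and clients receive deltas rather than every sample.

```python
from collections import defaultdict


class Collector:
    """Hypothetical in-process collector; the real system is networked."""

    def __init__(self, read_counter):
        self._read_counter = read_counter      # function: name -> value
        self._subscribers = defaultdict(list)  # name -> callbacks
        self._last_value = {}
        self.reads = 0

    def subscribe(self, name, callback):
        self._subscribers[name].append(callback)
        # New subscribers get the current value, if one is known.
        if name in self._last_value:
            callback(name, self._last_value[name])

    def poll(self):
        # Only counters with at least one subscriber are read, once each.
        for name, callbacks in self._subscribers.items():
            value = self._read_counter(name)
            self.reads += 1
            if self._last_value.get(name) != value:
                self._last_value[name] = value
                for callback in callbacks:
                    callback(name, value)


samples = {"cpu": 10, "disk": 70}
collector = Collector(lambda name: samples[name])
seen_a, seen_b = [], []
collector.subscribe("cpu", lambda n, v: seen_a.append(v))
collector.subscribe("cpu", lambda n, v: seen_b.append(v))

collector.poll()          # first reading is pushed to both clients
collector.poll()          # unchanged, nothing is pushed
samples["cpu"] = 55
collector.poll()          # changed, the delta is pushed
```

Both clients end up with the same two values, the "disk" counter is never read because nobody asked for it, and the source is polled three times rather than six.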

Is this a good way to do it?
- having too much data pushed at you is a bad thing
- being able to pull from a large selection of data points is a good thing

For developers, knowing they will have access to instrumentation data even in production encourages more detailed instrumentation.


Easy and fun APIs

Using LINQ: Language Integrated Query
LINQ via C# and CLINQ ("Continuous LINQ") = instant monitoring app (in about 10 lines of code)

var counters = ...
MainWpdfWindows.MainGrid = counters;
// go grab a beer

Tail a file across thousands of servers
- with filtering expression being run on the remote machines
- at the same time as someone else is (with no duplicate lines being sent over the network)
- multicast only to the people that are subscribed (?)

Open Source it?
- hopefully

Other implementations?
- may write a GTask / Erlang implementation









2008-11-21 Friday

10:45AM

Digg: An Infrastructure in Transition
Joe Stump, Lead Architect, Digg


35,000,000 unique
3,500,000 users
15,000 requests/sec
hundreds of servers

"Web 2.0 sucks (for scaling)" - Joe Stump

What's Scaling?
- specialization
- severe hair loss
- isn't something you do by yourself

What's Performance?
- who cares?

Not necessarily concerned with how quick - but can they store everything they need - and return it in a reasonable amount of time.

Clusters of databases are designated as WRITE - others are READ
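
A small sketch of what the WRITE / READ split might look like from the application side. This is my own illustration in Python, not Digg's code: the ClusterRouter class, the host names, and the idea of choosing by the first SQL keyword are all assumptions.

```python
import itertools


class ClusterRouter:
    """Hypothetical router: writes go to the write cluster, reads rotate
    across the read cluster. Host names are placeholders."""

    def __init__(self, write_hosts, read_hosts):
        self._write = itertools.cycle(write_hosts)
        self._read = itertools.cycle(read_hosts)

    def host_for(self, sql):
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in {"INSERT", "UPDATE", "DELETE", "REPLACE"}:
            return next(self._write)
        return next(self._read)


router = ClusterRouter(["write-1"], ["read-1", "read-2"])
targets = [
    router.host_for("SELECT * FROM stories"),
    router.host_for("INSERT INTO diggs VALUES (1, 2)"),
    router.host_for("select * from users"),
]
```

The sketch ignores replication lag, so a read issued right after a write could return stale data from a read host.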

Server Tiers
- Netscalers
- Applications
- Lucene (for search)
- Recommendation Engine (graph database)
- MogileFS - distributed WebDAV store
- Database servers clusters - which serve different parts of the site (LULX, AFK, ZOMG, WTF, ROFL)
- IDDB

A normal relational database doesn't work well for certain types of views into your data

MySQL has problems under high write-load

Messaging framework
- XMPP? - stateless - can't go back in time
- Conveyor (allows rewind of data?)

IDDB
- elastic horizontal partitions
- heterogeneous partition types
- multi-homed
- IDs live in multiple places
- partitioned result sets

MemcacheDB
- Memcached + BDB
-- supports replication, keyscans
- 28,000+ writes per second
- Persistent key/value storage
- Works with Memcached clients
- used in areas where de-normalization would require more writes than MySQL can handle

War Stories...
- Digg Images
-- 15,000 - 17,000 submissions per day
-- crawl for images, video, embeds, source, and other meta data
-- ran in parallel via Gearman <== Need to research this...


- Green Badges
-- 230,000 diggs per day
-- most active Diggers are also most followed
-- 3,000 writes per second
-- ran in background via Gearman
-- Eventually consistent

2008-11-20 Thursday - QCon 2008 Afternoon

Disclaimer: There may be many mistakes/errors in my blog notes for QCon over the next few days - as I am writing these posts while I'm sitting in sessions - and many of these notes are stream-of-consciousness - as I try to keep up with the presenters.

I'm in San Francisco attending QCon 2008 this week.




5:15pm

Designing Enterprise IT Systems with REST: A (Cloudy) Case Study
presenter: Stuart Charlton, Chief Software Architect, Elastra

Web architecture helps to burst silos

Classical "Good SOA" interfaces

FEA - large dictionary for Canonical Semantics


Conway's Law: http://en.wikipedia.org/wiki/Conway%27s_law

"...organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations"

"Any piece of software reflects the organizational structure that produced it"

"It is a consequence of the fact that two software modules A and B cannot interface correctly with each other unless the designer and implementer of A communicates with the designer and implementer of B. Thus the interface structure of a software system necessarily will show a congruence with the social structure of the organization that produced it."



The Hypermedia Alternative
- create a document that describes a "state machine"

Problem Domain
(Elastra is a cloud computing vendor)
- IT Services Management & Provisioning

Architectural & Change Considerations
Designs: Architectural Views
- Lifecycle: birth, growth, failure, recovery, death


Organizational & Geographic Distribution

Governance Fallacy:
...Federation: there is a chief at the top that has the will and the control authority

versus

...Confederation

The Decentralized, Declarative Data Center

You don't need a registry in a RESTful collection of services - by definition, a REST request is self-describing (?)

Graph of Information and Interfaces

The agent (browser) surfs the graph of information and interfaces -- nodes of which may expose Dynamic Interfaces - which results in a new or changed bit of information (and may modify the graph of information and interfaces).

Hypermedia is a mix of data and control

Data Out (GET)
Data In (PUT)
Interface Out (GET)
Process Something (POST)

==> returns Response Codes


What's a Dynamic Interface?
- Interaction port that is bound at runtime
-- CORBA Dynamic Invocation interface
-- java.lang.reflect
-- capability to negotiate (e.g. TELNET)

- Agent matches what they know to what's available

- The Big Difference? Metadata over Methods
-- The semantics are in the context of the link
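
To illustrate "bound at runtime" and "agent matches what they know to what's available", here is a minimal reflection sketch in Python (the talk's examples were CORBA DII and Java reflection, so this is an analogy, not the presenter's code). The PrinterService class and the operation names are hypothetical.

```python
class PrinterService:
    """Hypothetical service whose operations are discovered at runtime."""

    def status(self):
        return "idle"

    def print_page(self, text):
        return f"printed: {text}"


def available_operations(service):
    # Reflection: list the public callables the object exposes right now.
    return sorted(
        name for name in dir(service)
        if not name.startswith("_") and callable(getattr(service, name))
    )


def invoke_first_known(service, known, *args):
    """Bind to the first operation the agent knows that the service offers."""
    for name in known:
        operation = getattr(service, name, None)
        if callable(operation):
            return name, operation(*args)
    raise LookupError("no known operation is available")


service = PrinterService()
operations = available_operations(service)
chosen, result = invoke_first_known(service, ["fax_page", "print_page"], "hello")
```

The agent prefers `fax_page`, finds it is not offered, and falls back to `print_page`. In the hypermedia version the "operations" would be links and forms in the returned document rather than method names.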

How can I describe my interfaces?
- Tightly Coupled
-- XML Schema Definition with minOccurs > 0

- Looser Coupled
-- Dynamically generated XML Schema Definitions
-- Edit Link Relations (e.g. AtomPub Media Entries)
-- Forms (e.g. XForms, HTML)
-- Annotate each field with a Persistent URL


What about Versioning and Provenance?
- "Metabase" Intermediary
-- Annotation
---- Collections, Search, SPARQL Query
---- Shredded historical representations

Tooling is a bit sparse...

Security: Federated Identity
- SAML (very robust in a Java world) - complex
- WS-Federation (for Microsoft integration)
- OpenID (mind the phishing)
- Point-to-Point (sadly)

- OAuth has promise but is very young - the current flavor for RESTful implementations

Towards the Semantic Web
- It's not crazy - it's just
-- layering logic on top of the web (an Open-World RDBMS)
-- enabling querying and mashing of web pages without neurosurgery

- SPARQL ("sparkle") is a very big win for RESTful implementations
-- query database or the web of hypermedia
-- same syntax - nothing changes
-- declarative integrity enforcement for PUT and POST

- RDFa and GRDDL are easy to use (microformats?)
-- just annotate your HTML or write your own XSLT

Semantic Web Client Library - Query the web
-- http://www4.wiwiss.fu-berlin.de/bizer/ng4j/semwebclient


To Research: RDF serialization of JSON

Thursday, November 20, 2008

2008-11-20 Thursday - QCon 2008 Morning

I'm in San Francisco attending QCon 2008 this week.

Disclaimer: There may be many mistakes/errors in my blog notes for QCon over the next few days - as I am writing these posts while I'm sitting in sessions - and many of these notes are stream-of-consciousness - as I try to keep up with the presenters. I will attempt to clarify/correct any errors within the week.




9AM...
The morning keynote speaker is Tim Bray.

Tim Bray's morning keynote

MemCacheD - distributed hash table
(is there a .NET plugin available???)
- used by facebook, wikipedia
danga.com/memcached
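
A toy sketch of the "distributed hash table" idea, for my own understanding. In memcached the servers do not coordinate with each other; the client decides which server holds a given key. The ShardedCache class below is a hypothetical stand-in where each dict plays the role of one server, and the modulo hashing is a simplifying assumption (real client libraries often use consistent hashing so that adding a server moves fewer keys).

```python
import zlib


class ShardedCache:
    """Toy stand-in for a memcached client: each dict plays one server."""

    def __init__(self, server_names):
        self._names = list(server_names)
        self._servers = {name: {} for name in self._names}

    def server_for(self, key):
        # Simple modulo hashing; an assumption, not memcached's own code.
        index = zlib.crc32(key.encode("utf-8")) % len(self._names)
        return self._names[index]

    def set(self, key, value):
        self._servers[self.server_for(key)][key] = value

    def get(self, key):
        return self._servers[self.server_for(key)].get(key)

    def stored_on(self, name):
        return dict(self._servers[name])


cache = ShardedCache(["cache-a", "cache-b", "cache-c"])
cache.set("user:42", {"name": "Ada"})
home = cache.server_for("user:42")
```

The same key always hashes to the same server, so any client with the same server list can find the value without a central directory.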

tbray.org/ongoing

ruby.gemstone.com

MagLev: "Ruby that scales"

Drizzle - a lightweight SQL Database for Cloud and Web computing
fork of MySQL


Apache CouchDB - restful HTTP accessed database
(database is implemented as JSON name/value pairs)
- distributed, fault-tolerant and schema-free document-oriented database accessible
via a RESTful HTTP/JSON API. Provides a robust, incremental replication with bi-directional
conflict detection and resolution, and is queryable and indexable using a table-oriented view engine
with JavaScript acting as the default view definition language.
- Implemented in Erlang
- incubator.apache.org/couchdb

AtomPub (REST), RFC 5023 - built by Tim Bray

Bonnie, developed by Tim Bray
- applications need to read/write files
- sometimes they use smart I/O - sometimes they process characters and rely on buffering libraries
- sometimes they need to update files in place
- sometimes they need to seek around


Sun 7410 server has some very interesting performance characteristics for I/O

presentation slides
tbray.com/tmp/QCSF08.pdf







Thursday, November 20, 2008 - morning sessions

DSLs in Practice

Jay Fields, DRW Trading
5Ws of DSLs

Domain Specific Language
- a computer programming language of limited expressiveness focused on

- a DSL supports a bare minimum

examples of DSLs: sql, regular expressions, spring config, linq
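
To pin down what "limited expressiveness" can look like in practice, here is a tiny internal DSL sketched in Python. It is my own example, not one from the talk: the Query class, `select_from`, and the table and column names are hypothetical. It supports a bare minimum (filter and limit) and nothing else.

```python
class Query:
    """Hypothetical internal DSL: a fluent builder for one narrow task."""

    def __init__(self, table):
        self._table = table
        self._conditions = []
        self._limit = None

    def where(self, condition):
        self._conditions.append(condition)
        return self  # returning self is what lets the calls chain

    def limit(self, count):
        self._limit = count
        return self

    def to_sql(self):
        sql = f"SELECT * FROM {self._table}"
        if self._conditions:
            sql += " WHERE " + " AND ".join(self._conditions)
        if self._limit is not None:
            sql += f" LIMIT {self._limit}"
        return sql


def select_from(table):
    return Query(table)


sql = select_from("trades").where("symbol = 'IBM'").where("qty > 100").limit(10).to_sql()
```

The chained call reads close to a sentence, which is the appeal. The sketch does no escaping of its inputs, so it is not safe for untrusted values.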

Programmer Read/Write DSLs
-JMock
-Mockito
-Active Record

Domain Expert Readable
-RSpec
-Your domain model


-ignore programming best practices

-language noise should not exist
-domain experts design language


DSLs to research: JBehave, jQuery, RSpec Scenarios, Mockito, Rake, Rhino Mocks, db deploy, Prototype Effects, YUI widgets, Thrift

Seamless resource pub/sub

Wednesday, November 19, 2008

2008-11-19 Wednesday - QCon 2008 Day-1 conclusions

QCon 2008 Day-1 impressions and conclusions

I have found Mecca...the Holy Land.

So often I've paid to attend a conference - and felt that there were spots of "goodness" in the sessions that typically stretch on throughout a long day/week - but rarely felt that I was really getting my money's worth for the not insignificant time and expense that I sacrificed to travel and to attend a conference.

Not so today. Every session I have attended today has been right on target with the particular interests I have as an enterprise architect.

The speakers assembled for QCon 2008 are top-shelf, world class, industry recognized heavy-weights.

I look around the conference session meeting room - and during lunch today - and in every conversation that I have overheard or been engaged in - I have found thoughtful people that have similar concerns and interests relative to the topic of enterprise architecture.

2008-11-19 Wednesday - QCon 2008 Day-1 afternoon

Disclaimer: There may be many mistakes/errors in my blog notes for QCon over the next few days - as I am writing these posts while I'm sitting in sessions - and many of these notes are stream-of-consciousness - as I try to keep up with the presenters. I will attempt to clarify/correct any errors within the week.




I'm in San Francisco attending QCon 2008 this week.


Session: Golden Rules to Improve Your Architecture, 1PM-2PM

Alexander v. Zitzewitz
hello2morrow, Inc

created hello2morrow in 2005

"The Dragon of Complexity"
- invisible
- snatches you from behind
- your enemy


"You can never kill complexity"

"The friend of the dragon is the Law of Entropy"

Erosion of architecture - a fundamental law?

Architecture erosion is quite a known problem
- System knowledge and skills are not evenly distributed
- Complexity grows faster than size
- Unwanted dependencies are created without being noticed
- Coupling and complexity are growing quickly

Typical symptoms of an eroded architecture are a high degree of coupling and a lot of cyclic dependencies
- changes become increasingly difficult
- testing and code comprehension become more difficult
- deployment problems of all kinds

"The Truth can only be found in the code"

Mapping of physical elements to logical elements
- Each package is mapped to exactly one subsystem
- If packages contain types of several subsystems, virtual refactorings are helpful
- a good naming convention for packages can make your life very simple
-- com.company.projectName.verticalSlice.layer
- Subsystems should have interfaces
- Work incrementally
-- start with your layering
-- then add the vertical slices (if applicable)
-- define subsystem interfaces
-- fine tune the rules of engagement on the subsystem level

How to measure coupling
- ACD = average component dependency
- average number of direct and indirect dependencies
- rACD = ACD / number of elements
- NCCD: normalized accumulated component dependency

(from the large-scale C++ design book)
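
A worked example of the metrics above, as I understand them from Lakos's definitions. This is a sketch under stated assumptions: each component counts itself among its dependencies, CCD is the sum of those counts, and NCCD divides CCD by the CCD of a balanced binary tree of the same size. The component names and the `coupling_metrics` helper are hypothetical.

```python
import math


def depends_upon(graph, start):
    """All components reachable from start, including start itself."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def coupling_metrics(graph):
    n = len(graph)
    ccd = sum(len(depends_upon(graph, node)) for node in graph)
    acd = ccd / n
    # CCD of a balanced binary tree with n components (Lakos' baseline).
    tree_ccd = (n + 1) * math.log2(n + 1) - n
    return {"CCD": ccd, "ACD": acd, "rACD": acd / n, "NCCD": ccd / tree_ccd}


# Hypothetical three-component chain: ui -> service -> dao
graph = {"ui": ["service"], "service": ["dao"], "dao": []}
metrics = coupling_metrics(graph)
```

For the chain, the counts are 3, 2 and 1, so CCD is 6, ACD is 2.0, and rACD is about 67%. A tiny graph like this will always look heavily coupled in relative terms; the percentage only becomes meaningful for systems with hundreds of components.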

__BIG IDEA__
Robert C. Martin: Dependency Inversion Principle
(invert dependencies, add interfaces)

Spring is a big abstract factory pattern

Relative average component dependency (rACD) should be below 7% - heuristic target

Spring has no cyclic dependencies greater than +1

__Cyclic dependencies are evil__
- No cycles between packages
- the dependencies between packages must not form cycles
- cyclic physical dependencies among components inhibit understanding, testing, and reuse.

Introducing interfaces allows you to break cyclic dependencies
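
A minimal sketch of checking a package graph for cycles, and of how an interface package can remove one. This is my own Python illustration using a depth-first search, not the tooling shown in the session. The package names (`orders`, `billing`, `orders.api`) are hypothetical.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of packages, or None."""
    state = {}   # package -> "visiting" or "done"
    path = []

    def visit(node):
        state[node] = "visiting"
        path.append(node)
        for dep in graph.get(node, ()):
            if state.get(dep) == "visiting":
                return path[path.index(dep):] + [dep]
            if dep not in state:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[node] = "done"
        return None

    for node in graph:
        if node not in state:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical packages: orders and billing depend on each other.
cyclic = {"orders": ["billing"], "billing": ["orders"]}

# billing now depends only on an interface package that orders implements.
fixed = {
    "orders": ["billing", "orders.api"],
    "billing": ["orders.api"],
    "orders.api": [],
}

cycle_before = find_cycle(cyclic)
cycle_after = find_cycle(fixed)
```

In the fixed graph every arrow points "downward" toward `orders.api`, so the packages can be built and tested in a definite order.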

__Six Golden Rules for a successful project__
Rule 1: define a cycle free logical architecture down to the level of subsystems and a strict and consistent package naming convention

Rule 2: do not allow cyclic dependencies between packages

Rule 3: keep the relative ACD low (<7% for 500 compilation units, NCCD < 6)

Rule 4: limit the size of java files (700 LOC is a reasonable value)

Rule 5: limit the cyclomatic complexity of methods (e.g. 15)

Rule 6: limit the size of a java package (e.g. less than 50 types)

Tools
- JDepend





Neal Ford's presentation: 10 Ways to Improve Your Code, 3:45PM

#1 - Composed Method
Divide your program into methods that perform one identifiable task

refactoring to composed method


__Benefits of composed Method__
method names become documentation
large number of very cohesive methods
discover reusable assets that you didn't know were there
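
A small example of what the composed method style looks like, written in Python rather than the Java used in the talk. The invoice scenario and all function names are made up for illustration. The top-level function reads as a list of named steps, and each step does one identifiable task.

```python
def parse_lines(text):
    """Keep only non-empty lines, stripped of surrounding whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def to_amounts(lines):
    """Extract the amount column from 'name,amount' lines."""
    return [float(line.split(",")[1]) for line in lines]


def total(amounts):
    return round(sum(amounts), 2)


def invoice_total(text):
    """Composed method: reads as a list of named steps."""
    lines = parse_lines(text)
    amounts = to_amounts(lines)
    return total(amounts)


report = "widget,19.99\n\ngadget,5.01\n"
result = invoice_total(report)
```

Once the steps have names, `parse_lines` turns out to be reusable on its own, which is the "assets you didn't know were there" effect.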



#2 - Test-Driven Development (TDD)
(also Test-Driven Design)

__Benefits of TDD__
- first consumer


#3 - Static Code Analysis

- byte-code analysis: findbugs


#4 - Good Citizenship

singleton is bad because it mixes responsibilities, untestable, the object version of global variables

avoiding singletons
- create pojo for the business behavior (simple, testable)
- create a factory to create one-and-only-one
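
One way to read the two bullets above, sketched in Python instead of Java: the business object stays a plain class, and a separate factory owns the one-and-only-one policy. Class names and settings are hypothetical, and the factory shown here is not thread safe.

```python
class Configuration:
    """Plain object with the business behavior; trivial to build in a test."""

    def __init__(self, settings):
        self._settings = dict(settings)

    def get(self, key, default=None):
        return self._settings.get(key, default)


class ConfigurationFactory:
    """Owns the single-instance policy so Configuration does not have to."""

    def __init__(self, settings):
        self._settings = settings
        self._instance = None

    def instance(self):
        # Create the object on first use, then keep handing out the same one.
        if self._instance is None:
            self._instance = Configuration(self._settings)
        return self._instance


factory = ConfigurationFactory({"region": "us-west"})
first = factory.instance()
second = factory.instance()
print(first is second)  # True

# Tests can bypass the factory and build a throwaway object directly.
print(Configuration({"region": "test"}).get("region"))  # test
```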


#5 - yagni
(agile term)
You-ain't-going-to-need-it

build the simplest thing that we need right now

don't indulge in speculative development

increases software entropy

only saves time if you can guarantee you won't have to change it later

if features are weight, then anticipatory design reduces the rate of change that a code base can sustain.


Top 10 Corporate Code Smells
10 - we invented our own web/persistence/messaging/caching framework because none of the existing ones were good enough

9 - we bought the entire tool suite (even though we only needed about 10% of it) because it was cheaper than buying the individual tools.

8 - We use WebSphere because...(I always stop listening at this point)

7 - We can't use any open source code because our lawyers say we can't

6 - We have an Architect who reviews all code pre-checkin and decides whether or not to allow it into version control

5 - The only JavaDoc is the eclipse message explaining how to change your default JavaDoc template

4 - We keep all of our business logic in stored procedures..for performance reasons

3 - We don't have time to write unit tests (we're spending too much time debugging)

2 - The initial estimate must be within 15% of the final cost, the post-analysis estimate must be within 10%, and the post-design estimate must be within 5%

1 - There is a reason WSAD isn't called WHAPPY


#6 - Question Authority

test names - use underscores (not Camel Case)

non-intuitive

pair-programming studies - pairs produce code 15% slower after adjustment, but with 15% fewer defects


#7 - Single Level of Abstraction Principle (SLAP)
everything should be at the same level of abstraction
jumping abstraction levels makes code hard to understand / maintain


#8 - Polyglot programming
leveraging existing platforms with languages targeted at specific problems and applications

looming problems/opportunities
- massively parallel threading
-- use a functional language: Jaskell, Scala (inherently thread safe, immutable)
- schedule pressure
-- jruby on rails
- writing more declarative code via dsls
- build fluent interfaces
-- swiby:jruby + swing
--- (swiby is a DSL for UI programming)

"Java will become the glue language"


#9 - Learn Every Nuance of your chosen language/tool

- java's back alleys
-- reflection

- regular expressions


#10 - anti-objects
OOPSLA 2006 - collaborative diffusion presentation

"The metaphor of objects can go too far by making us try to create objects that are too much inspired by the real world"



Dr. Dean Wampler's presentation: Radical Simplification Through Polyglot and Poly-Paradigm Programming, 5:15PM

Simplification of code is necessary

Ola Bini's Three Layers

Functional Programming avoids multi-threading/concurrency issues

Functional programming: Declarative rather than imperative

DSLs tend to be very declarative

Examples of Functional Languages:
- Erlang
-- 9-9's reliability for AXD301 switch
-- for distributed, reliable, soft real-time systems
-- All IPC is optimized message passing
-- very lightweight and fast processes

- Scala
-- hybrid language (object and functional)
-- targets the jvm
-- "endorsed" by James Gosling at JavaOne
-- could be a popular replacement for Java

"Is a hybrid object-functional language better than using an object language with a functional language"?

Disadvantages of PPP
- N tool chains, languages, libraries, "ecosystems"
- impedance mismatch between tools

Advantages of PPP
- use best tool for a particular job
- can minimize amount of code required
- can keep code closer to the domain
- encourages thinking about architectures

Why go mainstream now?
- rapidly increasing pace of development
- pervasive concurrency
- cross-cutting concerns

2008-11-19 Wednesday - QCon 2008 Day-1 morning

Disclaimer: There may be many mistakes/errors in my blog notes for QCon over the next few days - as I am writing these posts while I'm sitting in sessions - and many of these notes are stream-of-consciousness - as I try to keep up with the presenters.




I'm in San Francisco attending QCon 2008 this week.

Martin Fowler gave a good keynote (with Rebecca Parsons) this morning - which I arrived 1/2 way through due to a slight flight arrival delay.

Ruby

10:15AM...Gregg Pollack kicked-off the Ruby track

11AM...I'm sitting in a Ruby presentation that is currently covering Merb
Presenter: Matt Aimonetti, he maintains blogs at:

merbist.com
railsontherun.com

One interesting point made during this morning's Ruby presentations: Ruby's historical bad performance reputation may not be currently valid given certain performance improvements (e.g. Merb compared to raw PHP, leading PHP frameworks, Django, Rails, Code Igniter, etc.)

Merb has three architecture layers (extension points ?):
plugins
slices
API

Merb's adaptability allows replacing core functionality with custom implementations

Benefits of Merb for developing/deploying Ruby applications:
- scalability
- performance
- modularity
- small memory footprint


Take-aways:
- Ruby is not slow
- Merb is flexible
- Merb is modular
- Merb is very scalable

- Yellowpages.com is using Merb for some backend components

- Wikipedia is investigating possible usage of Merb

- Matz (developer of Ruby) "Merb has a bright future...[thinks] will give users more freedom in a Ruby-ish way of programming"

Monday, November 17, 2008

2008-11-17 Monday

SEI publishes report integrating CMMI and Agile

The Open Source Enterprise: Its Time Has Come

I will be in San Francisco Wednesday thru Friday attending QCon 2008. The schedule is an incredible jam-packed 3-day fest for enterprise architects.

Frank Kenney has a funny, but poignant, blog post over on the Gartner site: Ahh Shucks, SOA Is A Failure

Sunday, November 16, 2008

2008-11-16 Sunday

My presentation at Seattle Code Camp v4.0 (held in Redmond, WA at the Digipen facility) went well yesterday. The topic was well received - and there was plenty of discussion amongst the attendees. During my previous presentation at Code Camp v3.0 I gave away some nice prizes - the best of which was a 320GB external USB drive. Yesterday I raised the bar - and gave away a 500GB external USB drive. My next presentation door prize will include a 1TB external USB drive.


Tonight I downloaded Drupal-6.6 - and must say it looks very very slick.
Drupal won Packt’s annual Open Source Content Management System (CMS) Award



I happened to come across a new blog tonight: SOA Probe, by Robert Morschel, Chief Architect at Neptune Software. He's written some very interesting posts - as well as some articles that have been published on SOA World Magazine: Is SOA Non-Trivial?

Another very useful blog I came across tonight: Sara Ford's Visual Studio 2008 Tip of the Day - as well as JasonHailey.com

This coming week is an important SOA event:
14th International SOA World Conference & Expo 2008 West will take place on November 19-21, 2008 at The Fairmont Hotel in San Jose, CA.

DeviceGuru.com - 16 interviews with Linux Kernel hackers

Friday, November 14, 2008

2008-11-14 Friday - Continuous Integration with Hudson, Subversion, and Glassfish

I'm giving a presentation at this year's Seattle Code Camp, 4.0 on November 15th, 9AM, Saturday morning in Redmond, Washington on the topic of Continuous Integration (with Hudson, Subversion, and Glassfish). In preparation for giving my talk, I've assembled a collection of links to resources that may be of interest to others:


Continuous Integration

Martin Fowler's CI paper

ThoughtWorks CI Feature Matrix



Hudson

Hudson at https://hudson.dev.java.net/

Meet Hudson

Hudson Best Practices

Getting started with Hudson

Writing a Hudson plugin (Part 6 - Parsing the results)

...more tutorials on developing Hudson plugins

Improving the Engineering Process Through Automation by Hudson

Continuous Integration and Code Inspection with Hudson and FindBugs

Rama Pulavarthi's Blog: Hudson@JavaOne 2008

Java Power Tools

The Java Power Tools Bootcamp courses

Eric Lefevre-Ardant on Java & Agile

Installing Hudson,Ssh,Svn,Trac,Tomcat,Maven,Ant from Scratch on my Ubuntu 8.04 Machine

Hudson - Tips and Tricks

Handy Dandy Hudson trick

Carlo Bonamic's presentation: Continuous Integration With Hudson

Hudson embraces Python
Using Hudson as a Continuous Integration tool for Python

Matthew McEachen: Getting started with Hudson

[Howto] Setting up a Continuous Integration Server for Grails with Hudson on VMWare

Hudson Gant plugin

A system tray icon for monitoring Hudson with Eclipse RCP

Running Gant builds in Hudson

testearly.com - Hudson’s so Groovy

testearly.com - CI with Hudson tutorial

Versioning a Hudson job configuration



Some alternatives to Hudson...

Apache Continuum

Atlassian Bamboo

CruiseControl

Want more alternatives?




Subversion

Subversion

TortoiseSVN

webSVN

statSVN

AnkhSVN

VisualSVN




Glassfish

Glassfish at
https://glassfish.dev.java.net/







Other Continuous Integration / Build Related Resources

IBM Developerworks: Technical library view: Code Quality series

IBM Developerworks: Forum: Improve Your Java Code Quality

IBM Developerworks: Spot defects early with Continuous Integration

What's Wrong with Build Systems in Java Today?

Interview: John Ferguson Smart, Author of Java Power Tools

IBM Developerworks: Distributed compilation

One build platform to rule them all?

CruiseControl Best Practices: not just for java

fusemetrics - Build Metrics Dashboard





2009-01-31 Updates

I've come across a few interesting links recently:

Sonatype blog, John Casey: The Hudson Build Farm Experience, Volume I

Git Integration with Hudson and Trac

My friend Nicholas Whitehead's JavaWorld article: Continuous integration with Hudson
(examples are given for Windows XP with Tomcat 6 or Ubuntu Linux with JBoss AS)


Kohsuke Kawaguchi's Blog: Hudson usage analysis




2009-02-02 Monday Updates

Automatically deploy to Glassfish using Hudson





2009-02-16 Monday Updates

Guide to building .NET projects using Hudson

Monitoring External Jobs

Distributed Builds

Fingerprint

Securing Hudson

Remote Access API

2009-05-16 Saturday Update:
Continuous Release and Upgrade of Component-Based Software
Tijs van der Storm
Centrum voor Wiskunde en Informatica (CWI)
Amsterdam, The Netherlands

2009-06-27 Saturday Update:
Ryan de Laplante: Creating a Windows service for Glassfish V2

Sunday, November 09, 2008

2008-11-09 Sunday - SOA Governance Example Processes

SOA governance: Examples of service life cycle management processes

2008-11-09 Sunday - SOA Faults and Exceptions

Faults and exceptions in JAX-WS

2008-11-09 Sunday - Automated Deployment

I've been thinking about how to simplify the automation of deploying (and potentially rolling back) some of the infrastructure configuration files, components, business rules, etc. of a Service Oriented Architecture.

This is particularly challenging if you are seeking to leverage a substantial collection of heterogeneous Open Source tools.

Capistrano was mentioned by Martin Fowler in his paper on Continuous Integration.

http://capify.org/
Capistrano is a tool for automating tasks on one or more remote servers. It executes commands in parallel on all targeted machines, and provides a mechanism for rolling back changes across multiple machines. It is ideal for anyone doing any kind of system administration, either professionally or incidentally.



Websphere: Automated Deployment of Enterprise Application Updates, Part 1 - Basic concepts

TFS Deployer interfaces the Team Foundation Server build store with PowerShell allowing script execution to be triggered across multiple machines when a user transitions the quality indicator of a build from one state to another.

Robocopy
A command-line directory replication command. It was available as part of the Windows Resource Kit, and introduced as a standard feature of Windows Vista and Windows Server 2008.


The Windows Installer XML (WiX) is a toolset that builds Windows installation packages from XML source code. The toolset supports a command line environment that developers may integrate into their build processes to build MSI and MSM setup packages.

Ayende Rahien: Requirements 101: Have an automated deployment

IBM Developerworks: Automatic deployment toolkit for an SOA project environment, Part 4: The automatic Build-Deploy-BVT toolkit for SOA projects

Friday, November 07, 2008

2008-11-07 Friday - OpenOffice

IBM, Sun Microsystems Launch ODF Toolkit Union To Grow Adoption, Community and Software Innovation

The OpenOffice.org ODF Toolkit Project

I have been using OpenOffice 2.4 for the last 3-4 months - as well as Mozilla Thunderbird (for email) - and I have to say I am __very, very, very__ pleased with both so far.

2008-11-09 Update
I've upgraded my OpenOffice install to version 3.0.

Sunday, November 02, 2008

2008-11-02 Sunday - Misc. Links

Google introduces service-level guarantee for its Apps suite
The Premier Edition of the Google Apps online productivity and collaboration suite will come with a 99.9 percent per-month uptime guarantee for the Gmail, Calendar, Docs, Sites and Google Talk services.

In effect, Google is promising compensation if downtime exceeds around 45 minutes a month -- but it won't count outages of less than 10 minutes' duration towards this total. The SLA conditions define downtime as when the "user error rate" exceeds 5 percent, as measured on the server side. Customers could therefore experience far more than 45 minutes of downtime, in shorter bursts, without compensation.
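
The arithmetic behind the "around 45 minutes" figure can be checked with a short Python sketch. The function names and the sample outage list are made up for illustration. Only the 99.9 percent figure and the 10-minute rule come from the quoted article.

```python
from fractions import Fraction


def allowed_downtime_minutes(uptime_percent, days_in_month):
    """Minutes of downtime per month that still meet the uptime guarantee."""
    total_minutes = days_in_month * 24 * 60
    downtime_share = 1 - Fraction(uptime_percent) / 100
    return float(total_minutes * downtime_share)


def countable_downtime(outage_minutes, minimum=10):
    """Sum only the outages long enough to count towards the SLA total."""
    return sum(m for m in outage_minutes if m >= minimum)


print(allowed_downtime_minutes("99.9", 30))  # 43.2
print(allowed_downtime_minutes("99.9", 31))  # 44.64

# Hypothetical month: twelve 9-minute outages add up to 108 real minutes,
# yet none of them counts, so the SLA total stays at zero.
outages = [9] * 12
print(sum(outages), countable_downtime(outages))  # 108 0
```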

2008-11-02 Sunday - Fail Fast

I mentioned in a previous blog post this weekend "Fail Fast, Learn Early". After writing that I thought, I wonder what others have said before about "Fail Fast"?

Here are a few of the interesting links I found:

Kate Gregory: Fail Fast. I spent some time reading a bit more of Kate's other postings - and highly recommend her blog as another one to visit often.

Jim Shore (of Martin Fowler's ThoughtWorks): Fail Fast. A short pdf on the technical aspects of Java code exceptions - but still pertinent to the idea of failing fast in the more general business / management context - and a great technical article.

Redefining Business: Ready, Set, Fail
Speed and innovation compel businesses to build a culture of positive failure
By Chris Murphy and Diane Rezendes Khirallah

Joshua Porter: The Freedom of Fast Iterations: How Netflix Designs a Winning Web Site

blog.anamazingmind.com Learning Mastery 3 - Fail Early, Fail Often

Saturday, November 01, 2008

2008-11-01 Saturday - Misc.

Is LINQ to SQL Truly Dead?

Microsoft Enterprise Library 4.1 – October 2008

Brad Abrams: AjaxWorld Talk: Building Rich Internet Applications Using Microsoft Silverlight 2. Impressive!

Brad Abrams: Framework Design Guidelines slides from PDC2008 Talk

Brad Abrams: .NET 3.5 and .NET 4 Poster

2008-11-01 Saturday - BizTalk 2006 R2 Performance

I'm currently assisting a client with researching BizTalk 2006 R2 performance tuning, here are some of the resources I've found that may be of use to others:

MSDN Resources
Troubleshooting BizTalk Server Performance

Identifying Performance Bottlenecks

Configuration Parameters that Affect Adapter Performance - (you __SERIOUSLY__ want to read this link)

Troubleshooting BizTalk Server Adapters

How to Diagnose Problems with the MSMQT Adapter

Troubleshooting BizTalk Server Dependencies

Orchestration Engine Configuration


blogs.msdn.com/biztalkperformance/

blogs.msdn.com/biztalk_server_team_blog/

BizTalk Server 2006 Comparative Adapter Study

BizTalk Server 2006 R2 Technical Documentation Library

BizTalk Server 2006 Troubleshooting Guide

Microsoft BizTalk Server Performance Optimization Guide

MSDN: Microsoft BizTalk Server Performance Optimization Guide

Microsoft BizTalk Server Performance Optimization Guide (download)

Known Issues with the MSMQ Adapter

Known Issues with the SOAP Adapter

Managing a Successful Performance Lab

BizTalk 2006 R2 University - PPT's, HOLs, & Demos from the Free Tech*Ed ]inbetween[ event

BizTalk Best Practice Analyzer v1.1 RTM
"The BizTalk Server 2006 Best Practices Analyzer performs configuration-level verification by reading and reporting only. The Best Practices Analyzer gathers data from different information sources, such as Windows Management Instrumentation (WMI) classes, SQL Server databases, and registry entries. The Best Practices Analyzer uses the data to evaluate the deployment configuration. The Best Practices Analyzer does not modify any system settings, and is not a self-tuning tool."


Microsoft BizTalk LoadGen 2007 Tool

ESB Guidance Architects User Group Presentation PPT

CodePlex: BizUnit - Framework for Automated Testing of Distributed Systems

The following article refers to BizTalk 2004, BizTalk Server Performance Tuning:
http://geekswithblogs.net/abhijeet/archive/2006/03/29/73667.aspx


Zeeshan’s Integration Bits: Performance tuning with BizTalk 2006 - (another __MUST__ read)

BizTalk Architecture, High Availability and MSMQ Adapters


Muhammed Ismail's Blog: MSMQ vs. MSMQT
"...there are some disadvantages to MSMQT as well, these include:

  • Poor performance, MSMQT is single threaded. In high volume scenarios, this could cause messages to be processed quite slowly.

  • Not all MSMQ features are implemented in MSMQT (remember that MSMQT emulates MSMQ, so it's not identical). In particular most of the MSMQ v3.0 features (such as MSMQ over HTTP) are not available

  • No published APIs for programmatic use

  • MSMQT is only used by BizTalk. MSMQ on a BizTalk computer could also be shared with other applications."


    Pradeep's WebLog: Difference between MSMQ, Biztalk Server and SQL Server 2005 broker service

    BizTalkMsgBoxDb Lock/Waits --> Critical performance slow down

    Udi Dahan: Scaling Long Running Web Services

    Udi Dahan on scalability

    Ewan Fairweather: BizTalk Performance - Useful technique to baseline your infrastructure

    Tom Hollander: Building a Pub/Sub Message Bus with WCF and MSMQ




    AVIcode BizTalk 2006 Application Management Pack

    Monitoring for security and connectivity problems related to incorrect configuration of BizTalk Adapters for HTTP, MSMQ, SMTP, SOAP, SQL Server, Windows SharePoint Services, and Oracle.

    Real-time detection of BizTalk Application failures due to erroneous .NET code activities within BizTalk Orchestrations and Pipelines, with identification of offending line of code.

    Real-time detection of BizTalk Application failures due to connectivity problems to MSSQL, Oracle, and DB2 databases within BizTalk Orchestrations.

    Real-time detection of BizTalk Application failures due to connectivity problems to third party Web Services within BizTalk Orchestrations.

    Tracing and Performance Analysis of MSSQL, Oracle, and DB2 databases usage within BizTalk Orchestrations.

    Assistance in debugging data processing errors inside XSLT transformations.

    Detection and Correlation of BizTalk resource utilization for Memory, CPU and I/O with overall Windows Server resource utilization for Memory, CPU and I/O at the moment a problem occurs.

    Trending analysis for Memory, CPU and I/O resource utilization for BizTalk Server


    2008-11-13 Update:
    Pete Klein (Director of Connected Systems) with Neudesic sent me a message today suggesting another resource link:
    BizTalk Server Database Optimization.

    Pete is a great resource to call on for BizTalk implementation help.

    2008-11-01 Saturday - SOA Challenges, SOA Pain

    Martin Fowler once posted a great piece about Service Oriented Ambiguity.

    I find that the most common challenge for introducing Service Oriented Architecture (SOA) into an organization is not about the technologies used (although there are challenges there to be sure).

    No, the real challenges are people-related: existing culture, power fiefdoms, resistance-to-change, squeezing necessary training into work schedules, ensuring a common vocabulary is established and adopted, achieving the broad dissemination of the conceptual understanding of the core principles of a services based architecture (the benefits as well as the costs), vendor sales teams that over-promise and all too often under-deliver, media pundits that over-hype the concept of SOA, Architecture Astronauts, resource staffing constraints, the reality of urgent business priorities that frequently require the reallocation of team members, maintaining executive sponsorship commitment over the long period necessary to build momentum and achieve demonstrable ROI, and managing expectations (no, SOA will probably not be faster than your hard-coded legacy application).

    Some of these are manifestations of funding constraints, but most of this is simply the pain of introducing change into an organization.

    So what can you do?

    Hitch Your Wagon to the Value Train: Focus on delivering business benefits that have a demonstrated ROI - and that can be performed in iterations that are time-boxed (say 1-3 months, but no more than 4-6 months).

    Fail Fast, Learn Early: Mitigate high-level risks as soon as possible. Identify potential performance issues before you have spent any significant portion of your budget on hardware and vendor software. Get your people to put their hands on the technology you'll be using sooner - and find out what works, what doesn't, and where you need additional training.