Tuesday, August 27, 2019

2019-08-27 Tuesday - Sparx EA v15 Demo 001 Filtering Diagram Elements with Tagged Property-Value

This is my first demo video for Sparx EA v15 (Demo #001 - Filtering Diagram Elements with Tagged Property-Value).


I've created a playlist on my YouTube channel to organize additional Sparx EA v15 demos I plan to record in the future.
Sparx EA v15 Demos Playlist

Sunday, August 25, 2019

2019-08-25 Sunday - Why You May (or, May Not) Wish to Retain My Services

Photo by Sarah Dorweiler on Unsplash
source: https://unsplash.com/photos/x2Tmfd1-SgA

You should know: If you prefer to keep things the way they are - you may not wish to engage my services :)

1) I will challenge perspectives - and assumptions. Sometimes, this may make some folks uncomfortable.

2) I will help identify gaps and inefficiencies - and make suggestions for improvements. Sometimes, if needed, pointedly. Although, I will always strive to be considerate and diplomatic. Some folks may find change difficult to digest.

3) I will challenge the status quo. At times, this may be discomforting.

4) I will not turn away from looking under rocks. Some folks may not want this.

5) I will actively work to help change things for the better. Some folks may actively resist this.

6) I insist on dealing from the top of the deck - with everyone (board members, executives, employees, contractors, vendors, partners, customers). Some folks may not be aligned with this.

7) I will tell you what you need to hear, not what you want to hear. I am fairly certain - this may not always be well received by some folks.

8) I will view any project expenditures I may propose - as if they were my own money being spent. You should know, I am a frugal person. I abhor waste. A $1 investment needs to produce a return (however, that return may be expressed in several possible dimensions - not all of which are strictly/directly monetary).

9) I am deeply interested in considering 2nd and 3rd order effects of decisions - considering the enterprise as a patient - that should be diagnosed - and treated - holistically. My perspective can be a positive balancing force with ready-fire-aim cultures - but, it may also create discomfort for some folks.

10) I am very detail oriented. This expresses itself in my work as being thorough. This may be uncomfortable for folks that may not wish to have their areas examined too closely - or who have little interest in root-cause determination.

Sunday, August 18, 2019

2019-08-18 Sunday - Git Repository Naming Conventions Research

Photo by Jeremy Thomas on Unsplash
source: https://unsplash.com/photos/FO7bKvgETgQ


I've started drafting a proposed guideline document (consider it a possible baseline from which to customize your own "Git Repository Naming Conventions" standard). This will be published under an Open Source MIT License.

As part of my research, I've reviewed many articles and discussion threads, and have examined the naming conventions used (explicit, implicit, or derived/implied) for repositories in dozens of well-known, high-profile, Open Source GitHub accounts.

If you might be interested in receiving a link to review the preliminary draft - just leave a comment, and I'll send you a private message with the link (targeting early September).

If you have strong opinions, ideas to contribute - please contact me privately.

I welcome suggestions, review feedback, and additional input - and will be happy to cite your co-authorship.

If you know of a non-trivial, publicly published, standard/convention - please share that as well in a comment below.
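A naming convention is most useful when it can be checked mechanically. As a purely illustrative sketch - the pattern below is a hypothetical example convention (lowercase, hyphen-delimited segments), not the actual rules from my draft:

```python
import re

# Hypothetical example convention: lowercase alphanumeric segments,
# hyphen-delimited (e.g. "payment-service", "repo-naming-guide").
# NOT the draft's actual rule set - just an illustration of how a
# convention can be enforced mechanically.
REPO_NAME_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")

def is_valid_repo_name(name: str) -> bool:
    """Check a repository name against the example convention."""
    return bool(REPO_NAME_PATTERN.match(name))

print(is_valid_repo_name("payment-service"))  # True
print(is_valid_repo_name("PaymentService"))   # False
```

A check like this can be wired into a pre-receive hook or a CI job - so the convention is enforced, rather than merely documented.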

Friday, August 16, 2019

2019-08-16 Friday - Sparx EA v15, now available

Photo by Cesar Carlevarino Aragon on Unsplash
Source: https://unsplash.com/photos/NL_DF0Klepc



I noted that Sparx EA v15 was officially released on July 27th, 2019.


Some particular improvements that I noted after doing a quick read of the Release Notes:
  • Diagrams
    • AWS Architecture and Google Cloud Platform technologies updated to improve behavior when dropping an image with their toolboxes active
    • Amazon/AWS and Google/GCP toolboxes now check for imported image library during use
    • Amazon/AWS patterns updated
    • Added support for web style back navigation when following a hyperlink on a diagram
  • Source Directory and Visual Studio Solution Import Updated
    • Improved usability for detecting and adding language macros
    • Added a 'Dry Run' option to quickly scan for potential issues
    • Added an option to compare timestamps of files being imported before performing the import (Currently C++ only)
    • Added Package per File option to Solution import
  • Code Engineering
    • Source Code Directory Import 'Package per File' improved
    • Visual Studio Project Import now supports importing as Package per File

  • Other Changes
    • Error handling for bulk updates over the cloud improved
    • Modeled Add-Ins can now be loaded in eap files when JET 4 is disabled
  • User Interface
    • Element Browser now shows a context menu with Navigation Options for root item
    • Traceability window handling for elements with very large numbers of relationships improved
  • Dynamic Model Add-Ins
    • All behavioral code is written in JavaScript
    • Add-ins can access all Repository based behavior
    • Add-ins can respond to repository events (signals)
    • Add-ins can set up and use property lists
    • Add-ins can call SBPI-based APIs
    • Defined add-ins can be published to XMI or deployed to a RAS service to allow use across multiple models
    • Add-ins can now return "Workflow" in EA_Connect to opt-in to workflow events
    • Can mail model users when state changes
  • Javascript engine updated
    • Built-in JavaScript support updated to use Mozilla SpiderMonkey 63
    • Provides new functions such as JSON parsing
  • Simple Drawing Style
    • Introducing a new diagram drawing style that will make it easy to draw flat and simplistic diagrams
    • Similar to Visio style drawings
  • Diagram Alternate Views
    • Specification - A document style view of the elements on the current diagram. Focused on the name and the notes of elements
  • Diagram Matrix View
    • This connector focused view provides a view of how elements on the current diagram are related
    • Provides a relationship matrix view for the elements that appear on a diagram
    • Drawn in a style similar to the state table view
    • Uses existing quicklinker rules to determine which connectors can be created
    • Includes the option to limit the display to those elements that have relationships defined
  • Model Documents XML export
    • Model documents can now be exported to XML that allows importing all linked packages into another model
    • Allows you to easily define a model subset that can be included in a restricted WebEA view

Thursday, August 08, 2019

2019-08-08 Thursday - The Coq Language


https://en.wikipedia.org/wiki/Coq

"Coq is an interactive theorem prover. It allows the expression of mathematical assertions, mechanically checks proofs of these assertions, helps to find formal proofs, and extracts a certified program from the constructive proof of its formal specification. Coq works within the theory of the calculus of inductive constructions, a derivative of the calculus of constructions. Coq is not an automated theorem prover but includes automatic theorem proving tactics and various decision procedures."
"The Association for Computing Machinery rewarded Thierry Coquand, Gérard Pierre Huet, Christine Paulin-Mohring, Bruno Barras, Jean-Christophe Filliâtre, Hugo Herbelin, Chetan Murthy, Yves Bertot and Pierre Castéran with the 2013 ACM Software System Award for Coq."
"The word coq means "rooster" in French, and stems from a local tradition of naming French research development tools with animal names.[4] Up to 1991, Coquand was implementing a language called the Calculus of Constructions and it was simply called CoC at this time. In 1991, a new implementation based on the extended Calculus of Inductive Constructions was started and the name changed from CoC to Coq, also an indirect reference to Thierry Coquand who developed the Calculus of Constructions along with Gérard Pierre Huet and the Calculus of Inductive Constructions along with Christine Paulin-Mohring." 
https://coq.inria.fr/
"Coq is a formal proof management system. It provides a formal language to write mathematical definitions, executable algorithms and theorems together with an environment for semi-interactive development of machine-checked proofs. Typical applications include the certification of properties of programming languages (e.g. the CompCert compiler certification project, or the Bedrock verified low-level programming library), the formalization of mathematics (e.g. the full formalization of the Feit-Thompson theorem or homotopy type theory) and teaching."
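For a minimal flavor of the interactive style described above - here is the classic first proof from the Software Foundations teaching materials:

```coq
(* A simple theorem: 0 is a left identity for addition on nat. *)
Theorem plus_O_n : forall n : nat, 0 + n = n.
Proof.
  intros n.     (* introduce the quantified variable *)
  simpl.        (* 0 + n computes to n *)
  reflexivity.  (* both sides are now identical *)
Qed.
```

Each tactic line is checked interactively; the kernel only accepts the proof once Qed. verifies the complete proof term.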



Github Resources:


https://github.com/coq


A port of Coq to JavaScript

Tutorial Resources:

Sandboxes:


Note:
My awareness and interest in Coq was spurred by this AI podcast interview that Lex Fridman conducted with George Hotz (re: Comma.ai, OpenPilot, and Autonomous Vehicles)
(Bloomberg: "The First Person to Hack the iPhone Built a Self-Driving Car. In His Garage")

Wednesday, August 07, 2019

2019-08-07 Wednesday - Microservices Saga Pattern


Photo by Eduardo Flores on Unsplash
https://unsplash.com/photos/qJ4FCI0sx98


In Madhuka Udantha's recent DZone article, Design Patterns for Microservices,  I noted he placed the Saga Pattern in a Database Pattern collection.

I wonder if this is perhaps because he is defining it in strict alignment with the original 1987 paper by Hector Garcia-Molina and Kenneth Salem?

It seems misplaced there, to me - given the utility of the Saga Pattern for orchestration, choreography, or database coordination.

Typically, I think of the Saga Pattern as a type of Integration Pattern - of which orchestration and choreography could be subtypes, if not outright peer-level collections to the Database collection that he defines.

Since the Saga Pattern may be used to manage loose orchestration with other services (either internal or external) - while possibly not dealing directly with a database at all - its placement in a Database collection seems out of place.
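To illustrate why I lean toward the Integration Pattern view - here is a minimal orchestration-style saga sketch (the service names and steps are hypothetical, purely for illustration): each step pairs an action with a compensating action, and on failure the orchestrator undoes the completed steps in reverse order - no shared database required.

```python
class SagaFailure(Exception):
    """Raised by a step to signal that the saga must be compensated."""

def run_saga(steps):
    """steps: list of (action, compensation) pairs of callables."""
    completed = []
    try:
        for action, compensation in steps:
            action()
            completed.append(compensation)
    except SagaFailure:
        # Undo the already-completed steps, most recent first.
        for compensation in reversed(completed):
            compensation()
        return False
    return True

log = []

def fail_shipping():
    raise SagaFailure("no carrier available")

steps = [
    (lambda: log.append("reserve-inventory"), lambda: log.append("release-inventory")),
    (lambda: log.append("charge-payment"),    lambda: log.append("refund-payment")),
    (fail_shipping,                           lambda: log.append("cancel-shipment")),
]

print(run_saga(steps))  # False
print(log)  # ['reserve-inventory', 'charge-payment', 'refund-payment', 'release-inventory']
```

Note that each participating service only sees local transactions; the coordination happens at the integration layer - which is why filing the pattern under "Database" feels like a category error to me.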


Background Reading:

See Arnon Rotem-Gal-Oz's 2012-2013 blog postings:


Chris Richardson's article, from earlier this year - which aligns more closely with Madhuka's name grouping...

Distributed Sagas for Microservices, by  (2018)

Saga: How to implement complex business transactions without two phase commit, by Bernd Rücker (2018)

You may also find these patterns in Thomas Erl's site (re: Arcitura Education Inc) of interest:

2019-08-07 Wednesday - Defect Density (DD) Metrics

Photo by Aswin Anand on Unsplash
https://unsplash.com/photos/0Hmh461Goog


A friend recently inquired whether I might know of any published stats or papers on Defect Density (DD) metrics - something that might provide guidance on an industry average for the expected number of defects per 1K Lines of Code (LOC).

I think that is a fairly hard number to obtain - and it may very well vary greatly depending on a number of factors:
  • Business / Industry (e.g. NASA, Emergency Medicine Support Systems, Military-grade Avionics and Flight Control Systems, etc. vs. Social Media applications)
  • Level of expertise of development team members (not necessarily years of experience)
  • Programming Language (although, this is a weaker indicator/correlation factor)
  • Size of the code base, in LOC
  • Size of the team
  • Number of Classes/Modules
  • Complexity of the application/problem domain
  • Level of Regulatory Compliance for the business/problem domain
There are also other considerations to obtaining a relatively meaningful/accurate Defect Density average: 
  • Accounting for Averages skewing based on the level of clustering in Defect Severity Levels - for a given code base
  • Accounting for Averages skewing based on level of Code Duplication - for a given code base
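For reference, the base calculation itself is trivial - the hard part is everything above. A minimal sketch (the example numbers are illustrative only):

```python
def defect_density(defects: int, loc: int) -> float:
    """Defect Density expressed as defects per 1,000 lines of code (KLOC)."""
    if loc <= 0:
        raise ValueError("loc must be positive")
    return defects / (loc / 1000.0)

# Illustrative only: 120 defects found in a 40,000 LOC code base.
print(defect_density(120, 40_000))  # 3.0 defects per KLOC
```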

There are two other, second-order (and more easily obtained), metrics - which might be better used as a first-approximation (or proxy) predictor of expected Defect Density:



There is excellent Open Source tooling/reporting available for determining/monitoring those two - that is easily integrated into Continuous Integration build pipelines - and from which automated alerts can be configured, based on exceeding defined tolerance/threshold levels.

Some Suggested Background Reading, re: Defect Density:



  • v-SVR Polynomial Kernel for Predicting the Defect Density in New Software Projects, 6 pages, accepted at Special Session: ML for Predictive Models in Eng. Applications at the 17th IEEE International Conference on Machine Learning and Applications, 17th IEEE ICMLA 2018
    • "An important product measure to determine the effectiveness of software processes is the defect density (DD). In this study, we propose the application of support vector regression (SVR) to predict the DD of new software projects obtained from the International Software Benchmarking Standards Group (ISBSG) Release 2018 data set. Two types of SVR (e-SVR and v-SVR) were applied to train and test these projects. Each SVR used four types of kernels. The prediction accuracy of each SVR was compared to that of a statistical regression (i.e., a simple linear regression, SLR). Statistical significance test showed that v-SVR with polynomial kernel was better than that of SLR when new software projects were developed on mainframes and coded in programming languages of third generation"
      • "Verma and Kumar [14] use simple and multiple linear regression models to predict the DD of 62 open source software projects. They conclude that there is statistically significant level of acceptance for DD prediction using few repository metrics individually and jointly"
      • "Yadav and Yadav [15] apply a fuzzy logic model for predicting DD at each phase of development life cycle of 20 software projects from the top most reliability relevant metrics of each phase. They conclude that the predicted DD are found very near to the actual defects detected during testing."
      • "Mandhan et al. [16] predict DD by using simple and multiple regression models generated from seven different software static metrics (i.e., coupling, depth, cohesion, response, weighted methods, comments, and lines of code). They conclude that there is a significant level of acceptance for DD prediction with these static metrics individually and jointly."
      • "Rahmani and Khazanchi [11] apply simple and multiple regression models to predict DD of 44 open source software projects. They conclude that there is a statistically significant relationship between DD and number of developers and software size jointly"
      • "Knab et al. [18] use a decision tree model to predict DD of seven releases of an open source web browser project. They conclude that (1) it is feasible to predict DD with acceptable accuracies with metrics from the same release, (2) using lines of code has little predictive power with regard to DD, (3) size metrics such as number of functions are of little value for predicting DD, (4) it is feasible to predict DD with satisfactory accuracy by using evolution data such as the number of modification reports, and that (5) change couplings are of little value for the prediction of DD"
      • "The new software projects used in our study were obtained from the public ISBSG data set Release 2018. This release contains 8,261 projects developed between the years 1989 and 2016. The data of these projects were submitted to the ISBSG from 26 different countries [31]"
      • "Regarding limitations of our study, although the last version of the ISBSG release 2018 consists of 2,557 new software projects of the total (8,261 projects), after we followed the criteria suggested by the ISBSG for selecting the data sets for new software projects, we could only use a data set of 2 projects to train and test the models."
      • "DD is defined as the number of defects by 1000 functional size units of delivered software in the first month of use of the software. It is expressed as defects by 1000 function points"








  • Prediction of Defect Density for Open Source Software Using Repository Metrics, Journal of Web Engineering, Vol. 16, No. 3&4 (2017) 293-310

    • "In this work, a relationship of defect density with different repository metrics of open source software has  been  established  with  the  significance  level.  Five  repository  metrics  namely  Size  of  project, Number of defects, Number of developers, Number of downloads, and the Number of commits have been identified for predicting the defect density of open source project. This relationship can be used to predict the defect density of open source software. An analysis has been performed on 62 open source software available at sourceforge.net. Simple and multiple linear regression statistical methods have been used for analysis. The result reveals a statistically significant level of acceptance for prediction of defect density by some repository metrics individually and jointly"




    1. "As part of a Department of Homeland Security (DHS) federally-funded analysis, Coverity established a new baseline for security and quality in open source software based on sophisticated scans of 17.5 million lines of source code using the latest research from Stanford University’s Computer Science department. The LAMP stack — popular open source packages Linux, Apache, MySQL, and Perl/PHP/Python — showed significantly better software security and quality above the baseline with 0.290 defects per thousand lines of code compared to an average of 0.434 for 32 open source software projects analyzed"


  • https://scan.coverity.com/projects/
    • Review projects that others have submitted - that have been scanned by Coverity - and note their DD values...
    • You can also register a Github project for a scan...



    1. http://amartester.blogspot.com/2007/04/bugs-per-lines-of-code.html
    2. Code Complete: A Practical Handbook of Software Construction, Second Edition 2nd Edition by Steve McConnell
(a) Industry Average: "about 15 - 50 errors per 1000 lines of delivered code."
(b) Microsoft Applications: "about 10 - 20 defects per 1000 lines of code during in-house testing, and 0.5 defect per KLOC (KLOC = 1,000 lines of code) in released product (Moore 1992)."
(c) "Harlan Mills pioneered 'cleanroom development', a technique that has been able to achieve rates as low as 3 defects per 1000 lines of code during in-house testing and 0.1 defect per 1000 lines of code in released product (Cobb and Mills 1990). A few projects - for example, the space-shuttle software - have achieved a level of 0 defects in 500,000 lines of code using a system of formal development methods, peer reviews, and statistical testing."


    1. "This observation is very old, and comes from a very venerable source, namely Fred Brooks in his book "The Mythical Man Month". He was a top manager at IBM, and managed many programming projects including the millions-of-lines operating system OS/360. In fact he reported that the number of bugs in a program is not proportional to the length of code, but quadratic! According to his research, the number of bugs was proportional to the length of the program to the power 1.5. In other words, a program that is ten times longer has 30 times more bugs. And he reported that this held over all programming languages, and levels of programming languages."
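A quick sanity check of the quoted power law (exponent 1.5): a program ten times longer would be predicted to have 10^1.5 ≈ 31.6 times the bugs - consistent with the "30 times" figure above.

```python
# Relative bug count under the quoted super-linear law: bugs ∝ length^1.5
def relative_bugs(length_ratio: float, exponent: float = 1.5) -> float:
    return length_ratio ** exponent

print(round(relative_bugs(10), 1))   # 31.6
print(round(relative_bugs(100), 1))  # 1000.0
```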





  • The Personal Software Process, Experiences from Denmark,  Proceedings. 28th Euromicro Conference (2002)
    • "The focus of the research and practice in software process improvement (SPI) is shifting from traditional large-scale assessment based improvement initiatives to smaller sized, tailored initiatives where the emphasis is on the development personnel and their personal abilities. Personal software process (PSP^SM) is a method designed for improving the personal capabilities of the individual software engineer. This paper contributes to the body of knowledge within this area by reporting experiences from Denmark. The findings indicate an improvement in effort estimation skills and an increase in the resulting product quality in terms of reduced total defect density. The data shows that even with a relatively small effort (i.e., 10%) used in defect prevention activities (i.e., design and code reviews) almost one third of all defects could be removed and, consequently, the time required for the testing was reduced by 50%. On the basis of this data, the use of the PSP method in the software industry is discussed"

    1. "Software failure is becoming a serious issue. Ariane 5 provided a recent spectacular example of how a simple mistake, entirely avoidable, was allowed to sneak through the software verification stage and cause an immensely expensive failure. However, it is not just the aerospace industry which suffers such traumas. Here, the author discusses some common misconceptions."


2019-11-16 Saturday Addendum:

Today, I learned about the Stella Report - and recommend it as additional reading:
https://snafucatchers.github.io/


2020-09-15 Tuesday Addendum:

A tip of the hat to Pete Jarvis for his post on LinkedIn to this paper

How Do Fixes Become Bugs?
A Comprehensive Characteristic Study on Incorrect Fixes in Commercial and Open Source Operating Systems


Friday, August 02, 2019

2019-08-02 Friday - Suggested Practices for Highly Effective EA Teams


Photo by Wayne Bishop on Unsplash
Source: https://unsplash.com/photos/7YUW7fvIYoQ


 
Here's a quick start at creating a list of ideas for Suggested Practices for Highly Effective Enterprise Architecture (EA) Teams.

1) Asynchronous Communication: Your team adopts and uses a web-based team collaboration/communication/chat tool. If your team is relying on email as your primary means of communication - if you are passing documents back and forth via email - you are the canonical example of why I wrote this list. Hint: You should be posting links to your repository-based artifacts.

2) Simplified Asynchronous Collaboration: You use git as your primary collaboration repository / publishing mechanism for EA artifacts - with a web UI (e.g. private Github, Bitbucket, etc.).  A wiki is an excellent choice as a complement for some categories of content (e.g. published for consumption outside of the EA team).

3) Simplified Processes for Asynchronous Authoring/Publishing: You use Markdown, stored in the git repositories, to write the majority of your EA documents. Also see #2 above, re: Wiki.

4) Asynchronous Governance Processes: Your governance tooling and processes are based on the premise of a geographically distributed team - that operates in an asynchronous manner (i.e. artifacts are published for review, comments are collected, and voting is conducted - completely asynchronously). If the basis of your governance process is that you must interrupt team members' ability to stay focused on high-value tasks - by insisting on scheduling frequent, recurring, mandatory governance meetings - you are doing it wrong. Meetings should be the exception - not the norm. Governance meetings should primarily be called when there has been a lack of consensus in the voting - or when there are significant questions/discussions that cannot be serviced via a discussion thread within a private (and obviously, secure) discussion forum tool.

5) Automated Reminders: Your governance tooling and processes are designed to send out timely reminders for assigned tasks to be completed.

6) Automated Search: You leverage powerful automated search tools (e.g. Apache Solr, Elasticsearch, etc.) to make finding artifacts easy and painless.
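Tools like Solr and Elasticsearch do the heavy lifting here - but the core data structure they automate is an inverted index. A minimal sketch of the idea (the artifact names and text are hypothetical):

```python
from collections import defaultdict

def build_index(artifacts):
    """Map each lowercased term to the set of artifact names containing it."""
    index = defaultdict(set)
    for name, text in artifacts.items():
        for term in text.lower().split():
            index[term].add(name)
    return index

artifacts = {
    "adr-001.md": "Decision record for the payment service saga",
    "adr-002.md": "Decision record for the search platform",
}
index = build_index(artifacts)
print(sorted(index["decision"]))  # ['adr-001.md', 'adr-002.md']
print(sorted(index["saga"]))      # ['adr-001.md']
```

Production search engines add tokenization, stemming, ranking, and scale - but the lookup model is the same.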

7) Automated Annotation: You have processes that automate the majority of the effort to annotate, tag, and index the entire corpus of all the artifacts in your EA artifact repository.

8) Diagramming (Elements: Root Definition/Reuse): Whatever EA diagramming solution you adopt - it supports a core capability of managing a master reference inventory of element definitions; reusing those definitions in different diagrams; and easily visualizing the AS-IS/transition/TO-BE views. (hint: As a baseline example of this type of capability, look at the Diagram Filter capability of Sparx Enterprise Architect. YouTube demonstration video)

9) Diagramming (Element Relationships/Connectivity): Creating relationships between elements; and being able to quickly and easily explore, discover, query, reuse, and report the elements in the master inventory - across different diagrams. (hint: LucidChart, Gliffy, SmartDraw, Creately, Archi, Google Draw, LibreOffice Draw, PlantUML, Umbrello, and Visio are not such solutions). Automated Dependency Impact Analysis is thus possible.

10) Asynchronous (Diagram) Repository Collaboration: EA team members are able to collaboratively work together, asynchronously, in the same repository - while crafting diagrams, components, etc.

11) A Culture of Cultivating EA Artifact Reuse: There is a process defined, resources are staffed (rotated assignment among EA team members is suggested) and effort is allocated - to continually support the creation, harvesting, management, and refresh of reusable artifacts, exemplars, patterns, templates, white papers, technology position papers, etc. - to help accelerate/optimize the efforts of the team.

12) EA Kaizen: You conduct frequent retrospectives to review WHAT you do, HOW you do it - and analyze your own EA processes for improvement. Minimally, this should be done at least quarterly. This means EA should have a BACKLOG of improvements to manage.

13) Ruthless Efficiency: The relative cost vs. value of governance processes are rigorously challenged - before adoption, and are reviewed periodically for adjustment - or elimination.

14) Secure Asynchronous External Collaboration: You leverage cloud-based, encrypted-at-rest, file storage mechanisms for collaboration with external partners (e.g. Box, DropBox, Google Drive, even private Github repositories, etc. - GPG encrypted files, if/when needed/warranted)

15) Automated Generation/Update of an EA Dashboard: You need to tell a story to your peers and stakeholders. A dashboard is a good starting point. But, you cannot really afford the luxury of assigning vital resources to manually assemble/update such a dashboard. So, yours must be automated. Some ideas for possible metrics to collect (automatically), for a selected look-back period (e.g. Last Week, Last Month, Last Quarter, Last Year, vs. ~Current Period): Number of Artifacts Created, Modified; Number of Governance Reviews Scheduled, Completed; Governance Review Outcomes, by Status w/Counts; Diagrams Created, Modified; Diagram Repository Elements/Components Created, Modified; etc. If you have adopted the other recommendations in this list (in particular, #2, #3, #8, #9, #14) - then you have a solid basis on which to simplify the automation for information collection, analysis, and publishing.
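If your artifacts live in git (per #2 and #3), several of those counts fall out of `git log` directly. A hypothetical sketch - here parsing pre-captured `git log --name-status` output (the file names are made up):

```python
from collections import Counter

def summarize_changes(name_status_output: str) -> Counter:
    """Count created (A) vs. modified (M) files from `git log --name-status` lines."""
    counts = Counter()
    for line in name_status_output.splitlines():
        if line.startswith("A\t"):
            counts["created"] += 1
        elif line.startswith("M\t"):
            counts["modified"] += 1
    return counts

# Output as might be captured via:
#   git log --since="1 month ago" --name-status --pretty=format:
sample = "A\tdocs/adr-003.md\nM\tdocs/adr-001.md\nM\tdiagrams/context.md\n"
counts = summarize_changes(sample)
print(counts["created"], counts["modified"])  # 1 2
```

A scheduled job running a script like this per repository can feed the dashboard without anyone hand-counting anything.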

16) Internal URL Shortener: You use an internal, enterprise-wide URL shortener. This allows you to manage updates/corrections to the final target - without having to edit/update documents everywhere. Bonus Points: a separate batch refresh process that associates a computed hash with the files that the URLs point to - so that you can identify and rationalize/consolidate references to duplicate instances of documents.
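The "Bonus Points" idea above can be sketched quite simply (the short-link names and contents below are hypothetical):

```python
import hashlib

def content_hash(data: bytes) -> str:
    """SHA-256 hex digest of a target document's bytes."""
    return hashlib.sha256(data).hexdigest()

def find_duplicates(targets):
    """Group short-link names whose targets have byte-identical content."""
    by_hash = {}
    for short_name, data in targets.items():
        by_hash.setdefault(content_hash(data), []).append(short_name)
    return {h: names for h, names in by_hash.items() if len(names) > 1}

targets = {
    "go/ea-standards":   b"EA Standards v3",
    "go/arch-standards": b"EA Standards v3",  # a duplicate copy
    "go/naming-guide":   b"Repo Naming Guide",
}
dupes = find_duplicates(targets)
print(list(dupes.values()))  # [['go/ea-standards', 'go/arch-standards']]
```

Once duplicates surface, one copy becomes the canonical target and the other short-links are redirected to it.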

17) Daily Journals: Each EA team member publishes a Daily Journal - that is visible to the team. Wiki or git Markdown files suggested. This does four things for team members:
1) Asynchronously catch up on status updates - without interrupting conversations, and avoid the n(n-1)/2 communication-channel servicing problem;
2) Tribal knowledge is captured;
3) Reduces the need for team meetings - members can just quickly read/scan each of the members' most recent daily journals for an update;
4) Supports Business Continuity - in the event someone leaves the team unexpectedly/suddenly.
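The channel-count formula in point 1 above grows quadratically with team size - which is the whole argument for asynchronous status sharing:

```python
def channels(n: int) -> int:
    """Pairwise communication channels among n team members: n(n-1)/2."""
    return n * (n - 1) // 2

for n in (4, 8, 12):
    print(n, channels(n))  # 4 -> 6, 8 -> 28, 12 -> 66
```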

18) Continuous Knife Sharpening: On a rotating, periodic basis - each EA team member is tasked with researching, organizing, and giving a one-hour demonstration / technical talk on some new/interesting area of technology, methodology, strategy, practice, etc. Suggested minimal frequency: Monthly. Invited speakers from other internal groups (or from vendors, or other companies) are also good variations to consider.

19) Awesome Lists: There should be an "Awesome List" git repository - in which team members can record interesting, useful new ideas, resources, articles, open source (or vendor) solutions.  This creates a valuable, persistent knowledge repository for the team - that grows over time. Hint: If you are primarily using emails (or tools like Slack) to communicate such information to the team - you are doing it wrong. The added benefit of this approach is that new team members have immediate access to the historical record of the team's growing body of knowledge (which they won't have - if you continue to just send emails to each other - and Slack sucks for scrolling back in time).

20) Tips Repository: Rationale: See "Awesome List" #19 above. Within this repository are separate Markdown files, with the following suggested naming convention: Tips.{subject area}.md (examples)

21) Automated Knowledge Dissemination: Automated publishing of content for non-IT consumption - from the repositories and automated governance processes - is greatly simplified.  This eliminates a huge cost barrier to making EA artifacts widely available across the enterprise (i.e. automated publishing of static HTML content, or Markdown files - vs. having to pay massive licensing fees for users to access a more complex commercial EA tool/repository).  Sharing is Caring.

Copyright

© 2001-2021 International Technology Ventures, Inc., All Rights Reserved.