2023-12-31

2023-12-31 Sunday - 2023 Reflections

[image credit: Markéta Klimešová (MAKY_OREL) on Pixabay.com]

 

 Stats:

  • [376] commits to my personal knowledge management GitHub repositories

 
 
 
 
  • [2] long-term client engagements completed 
    • Client #1: ~$100B AUM, financial services sector
      • Enterprise Architecture consulting services
    • Client #2: ~$7B annual revenue, financial services sector
      • Enterprise Architecture consulting services
  • [1] Pro Bono consulting engagement, advising a SaaS startup  
  • [19] detailed resume reviews conducted
  • [30+] career coaching/mentoring sessions
  • [44] technology blog posts written 
  • [6,000+] miles traveled 
    • [13] cities visited
  • [5,844] LinkedIn connections (143 pending invitations to connect to review) 
  • [32,659] lines recorded in my 2023 Technology Reading List notes

The highlight of 2023 was this feedback from a startup founder:

"Following your code audit we started asking a ton of questions and ended up firing our tech team.  Thank goodness!  We would never have known if it weren't for you!"
(Key Findings: Hard-coded unauthorized user id and password, as well as unauthorized data exfiltration embedded in the business application source code, and highly sensitive data encryption keys committed into the source code repository. Significant software design and solution architecture concerns identified - related to reliability, performance, scalability, maintenance, cloud infrastructure costs, etc.)

Noteworthy 2023 LinkedIn Engagement
 

 
 


2023-12-23

2023-12-23 Saturday - 10 suggested books for your 2024 personal/professional development goals

 

[image credit: Mohamed_hassan on Pixabay.com]

If you are still working on your 2024 personal/professional development goals, here are some suggested books to consider adding to your reading stack:


Ten books that greatly influenced my professional development:


1. Customers for Life: How to Turn That One-Time Buyer Into a Lifetime Customer

https://www.amazon.com/dp/0385504454/


2. Secrets of Closing the Sale

https://www.amazon.com/dp/0800737903/


3. Father, Son & Co.: My Life at IBM and Beyond

https://www.amazon.com/Father-Son-Co-Life-Beyond/dp/0553380834/


4. As a Man Thinketh

https://www.amazon.com/As-Man-Thinketh-James-Allen/dp/1479409634/


5. Make It Happen Before Lunch: 50 Cut-to-the-Chase Strategies for Getting the Business Results You Want

https://www.amazon.com/Make-Happen-Before-Lunch-Chase/dp/0071360719/


6. How to Win Friends & Influence People

https://www.amazon.com/How-Win-Friends-Influence-People/dp/0671027034/


7. How to Stop Worrying and Start Living

https://www.amazon.com/How-Stop-Worrying-Start-Living/dp/0671035975/


8. Several of Gerald M. Weinberg's books:

https://en.wikipedia.org/wiki/Gerald_Weinberg

https://www.amazon.com/stores/Gerald-M.-Weinberg/author/B00459FFAC

  • An Introduction to General Systems Thinking
  • Becoming a Technical Leader
  • Secrets of Consulting
  • More Secrets of Consulting
  • Exploring Requirements 1: Quality Before Design
  • Exploring Requirements 2: First Steps into Design
  • Are Your Lights On?: How to Figure Out What the Problem Really Is
  • The Psychology of Computer Programming


9. Letters from a Stoic

https://www.amazon.com/Letters-Penguin-Classics-Lucius-Annaeus/dp/0140442103/


10. Meditations

https://www.amazon.com/Meditations-Penguin-Classics-Marcus-Aurelius/dp/0140449337/

2023-12-07

2023-12-07 Thursday - Quantum Computing Conference Talks

 

IBM Quantum Summit 2023

 
 

2023 Mathematical Aspects of Quantum Learning Workshop (UCLA, IPAM)

Institute for Pure & Applied Mathematics  (IPAM)

(22 videos, playlist)
https://www.youtube.com/playlist?list=PLHyI3Fbmv0SckwZK0xfc7itiq9nLWJeUF

https://www.ipam.ucla.edu/programs/workshops/workshop-ii-mathematical-aspects-of-quantum-learning/?tab=overview

https://www.ipam.ucla.edu/programs/workshops/workshop-ii-mathematical-aspects-of-quantum-learning/?tab=schedule


"Recent results have hinted at the role quantum computing and technology may play in the future of machine learning, but much remains to be understood.  For example, it has been shown that quantum computers can offer exponential improvements in learning from quantum data that comes from the physical world, and that compact quantum models can allow us to sample from probability distributions that seem inaccessible to traditional computing devices.  In addition, general purpose quantum algorithms exist to dramatically speed up a number of subroutines that are pivotal in existing machine learning systems, but come with challenging caveats or have led to novel classical algorithm counterparts that challenge the advantage provided by quantum systems.  However, fully grasping these results and connecting them to problems of interest today remains challenging for many reasons."

"In this workshop, we hope to bring together experts from mathematics, quantum algorithms, and machine learning to better understand this intersection and reach the full potential of quantum computing and machine learning.   This includes, but is not limited to, the ways in which quantum computers can accelerate existing machine learning algorithms, how we process inherently quantum data with either classical or quantum computers, and ways in which machine learning can change how we operate quantum devices.  We hope to identify a number of open questions of interest in each area, and draw strong connections to the mathematical foundations of both quantum computing and machine learning."

   

2023-10-22

2023-10-22 Sunday - Today's Meditation: A course plotted - is not the journey

 

[Image by Dorothe (aka Darkmoon_Art) from Pixabay.com]

Today's meditation:

A course plotted on a nautical chart - is not the journey.

Information on a map may be superseded by real-world events - or the map may be based on faulty information. Also, you must always consider that your instruments *may* be reporting incorrect data.

During the preparation for a long voyage, some years ago, I studied the historical voyages of others who had sailed the same areas into which I intended to go.

I studied the currents, the underwater geography, the weather patterns.

I made careful notes of areas of possible refuge and safety in which I might seek shelter from gales that might arise during various segments of my planned voyage.

I recorded the GPS coordinates of areas that were reported by other sailors to have good holding, in which to anchor.

After an exhausting passage of some days, I sought refuge in a wide bay - along a desolate section of the Baja coast, for a good night's rest.

As I navigated to the GPS coordinates recorded from another's previous voyage logs, I became concerned that the recommended location seemed to be in some potentially dangerously shallow water - as the swell of the ocean entered the bay and wrapped around and surged toward the area reported to be safe.

From a distance, the swells were a concern, but were not alarming.

As I drew closer - my apprehension and alarm skyrocketed.

I was entering an area that was clearly very dangerous. As the water became shallower, the swells grew in height - and became very large breaking waves.

I immediately swung the wheel 180-degrees and headed toward the middle of the bay, and somewhat deeper water - and dropped my anchor - where I had a peaceful night's rest.

The lessons to be learned:

- A voyage plan is just a plan. You must be agile and adaptable.

- The same goes for business plans, product plans, and project plans.

- From the Rules of Meeks: Rule #1 applies, always.

- If you are doing something that isn't working - don't be rigid in your thinking - be willing to embrace the pivot.

- There are always signs - you must be open to reading them.

- Awareness and adaptability are more powerful than blind optimism.

- Stubborn denial and refusal to accept new information - and insistence on maintaining a course - can result in disaster.



2023-10-20

2023-10-20 Friday - Podcast Idea Experimentation


[image by Tumisu from Pixabay.com]


Episode #2 of the proof-of-concept for a podcast idea completed today.

That episode was not recorded, so will never be broadcast.

Getting the style sorted, finding our rhythm.

Like musicians, we are jamming in the studio - discovering how we can riff and play some tunes. This is practicing - before the performance.

Finding where strengths complement, and exploring our ways of collaborating.

The topics & content covered today were nearly broadcast quality.

As an experiment, after two iterations, I think this might just have legs...

We even have a preliminary name picked for the show...

[image by Tumisu from Pixabay.com]

2023-10-16

2023-10-16 Monday - API Security Educational Resources

YouTube: API Security By Design
https://www.youtube.com/watch?v=acXpD1tRmCQ
Frank Kilcommins (Principal API Technical Evangelist, SmartBear) and José Haro Peralta (API consultant, author, and founder) [see the link below to José's 2023 book]

00:00 Intro
03:11 Why API Security matters
04:48 What is API Security
06:22 OWASP Top API 10 Risks
07:04 Broken Object Level Authorization
08:43 Broken Authentication
10:12 Broken Object Property Level Authorization
12:20 Unrestricted Resource Consumption
14:10 Broken Function Level Authorization
16:44 Unrestricted Access to Sensitive Business Flows
19:48 Server-side Request Forgery
22:26 Security Misconfiguration
24:28 Improper Inventory Management
27:11 Unsafe Consumption of APIs
30:08 Authentication vs Authorization
31:03 OAuth Overview
32:24 Authorization Code Flow
34:28 PKCE Flow
35:40 Client Credentials Flow
36:36 Refresh Token Flow
38:35 OpenID Connect
41:00 JSON Web Tokens (JWTs)
44:45 Security-by-design Overview
46:45 Vulnerable API design overview
47:26 Leaking objects
51:34 Integer Identifiers
53:22 Exposing server-side properties in user input
55:07 Flexible schemas with unknown properties
57:37 Summary and Q&A

Suggested Books:

API Security in Action (2020)
https://www.amazon.com/API-Security-Action-Neil-Madden/dp/1617296023/

Microservices Security in Action: Design secure network and API endpoint security for Microservices applications, with examples using Java, Kubernetes, and Istio 1st Edition (2020)
https://www.amazon.com/Microservices-Security-Action-Prabath-Siriwardena/dp/1617295957/

Advanced API Security: OAuth 2.0 and Beyond 2nd ed. Edition (2019)
https://www.amazon.com/Advanced-API-Security-Definitive-Guide/dp/1484220498/

Microservice APIs: Using Python, Flask, FastAPI, OpenAPI and more (2023)
https://www.amazon.com/Microservice-APIs-Jose-Haro-Peralta/dp/1617298417/

OAuth 2 in Action First Edition (2017)
https://www.amazon.com/OAuth-2-Action-Justin-Richer/dp/161729327X/

Secure By Design First Edition (2019)
https://www.amazon.com/Secure-Design-Daniel-Deogun/dp/1617294357

Defending APIs against Cyber Attack: Learn the secrets of defense techniques to build secure application programming interfaces (2024)
https://www.amazon.com/Defending-APIs-against-Cyber-Attack/dp/1804617121


Penetration Testing Tool Resources:

  1. https://github.com/intltechventures/Lab.Security/blob/master/PenetrationTestingTools.md



Other Resources:

  1. https://www.traceable.ai/2023-state-of-api-security
  2. NIST SP 800-95: Guide to Secure Web Services
    1. "The advance of Web services technologies promises to have far-reaching effects on the Internet and enterprise networks. Web services based on the eXtensible Markup Language (XML), SOAP, and related open standards, and deployed in Service Oriented Architectures (SOA) allow data and applications to interact without human intervention through dynamic and ad hoc connections. The security challenges presented by the Web services approach are formidable and unavoidable. Many of the features that make Web services attractive, including greater accessibility of data, dynamic application-to-application connections, and relative autonomy are at odds with traditional security models and controls. Ensuring the security of Web services involves augmenting traditional security mechanisms with security frameworks based on use of authentication, authorization, confidentiality, and integrity mechanisms. This document describes how to implement those security mechanisms in Web services. It also discusses how to make Web services and portal applications robust against the attacks to which they are subject."
  3. NIST SP 800-204: Security Strategies for Microservices-based Application Systems
    1. "Microservices architecture is increasingly being used to develop application systems since its smaller codebase facilitates faster code development, testing, and deployment as well as optimization of the platform based on the type of microservice, support for independent development teams, and the ability to scale each component independently. Microservices generally communicate with each other using Application Programming Interfaces (APIs), which requires several core features to support complex interactions between a substantial number of components. These core features include authentication and access management, service discovery, secure communication protocols, security monitoring, availability/resiliency improvement techniques (e.g., circuit breakers), load balancing and throttling, integrity assurance techniques during induction of new services, and handling of session persistence. Additionally, the core features could be bundled or packaged into architectural frameworks such as API gateways and service mesh. The purpose of this document is to analyze the multiple implementation options available for each individual core feature and configuration options in architectural frameworks, develop security strategies that counter threats specific to microservices, and enhance the overall security profile of the microservices-based application."
    2. NIST SP 800-204A Building Secure Microservices-based Applications Using Service-Mesh Architecture 
    3. NIST SP 800-204B Attribute-based Access Control for Microservices-based Applications using a Service Mesh
    4. NIST SP 800-204C Implementation of DevSecOps for a Microservices-based Application with Service Mesh 
  4. https://owasp.org/www-project-api-security/
    1. https://owasp.org/API-Security/editions/2023/en/0x11-t10/
    2. https://content.salt.security/owasp-api-top-10-2023-ebook.html 
    3. https://salt.security/blog/owasp-api-security-top-10-explained
    4. https://snyk.io/learn/owasp-top-10-vulnerabilities/api-security-top-10/
  5. MuleSoft: API security for the digital estate (Top 5 API Security Best Practices)

2023-10-15

2023-10-15 Sunday - Sonatype’s 9th Annual State of the Software Supply Chain

 

[image credit: Sonatype, 9th Annual State of the Software Supply Chain, p-4, with my highlights added]


Noteworthy:
Why the practice of actively managing your Software Bill of Materials (SBOM) is important...

Sonatype’s 9th Annual State of the Software Supply Chain
https://www.sonatype.com/hubfs/9th-Annual-SSSC-Report.pdf

Notable citations:

  • "The rate of download growth in open source consumption has slowed the past two years. In 2023, this trend continued with the average download growth rate sitting at 33%, which is exactly what it was last year. This is a stark comparison to the all-time high of 2021, which saw 73% year-over-year growth"
  • "Between 2022 and 2023, the number of available open source projects grew an average of 29%"
  •  "Maven and npm, are each estimated to reach over a trillion requests in 2023"
  • "[Maven and npm] represent 90% of the request served"
 

[image credit: Sonatype, 9th Annual State of the Software Supply Chain, p-9]


2023-10-11

2023-10-11 Wednesday - Today's meditation: "A" vs. "B" and "C" Players

[My corresponding LinkedIn post]

Today's meditation:
If you think there is no distinction between "A" players vs. "B" and "C" players - either you have not been around long enough - or you lack the basic skills to assess quality talent.

Here are some suggested clues to help you identify the players:

"A" players:
1. Execute consistently
2. Deliver results
3. Their quality is consistently exceptional
4. Can quickly assess/identify other "A" players
5. They actively and instinctively mentor others - and can help elevate a "B" to "A" level; or a "C" to "B"
6. Their presence can elevate an entire organization
7. They break logjams
8. Insatiably curious - constantly expanding/renewing their skills.
9. Actively seek to collaborate, communicate, document, share
10. Easily and quickly focus on what matters, what will move the needle, what is essential

"B" players:
1. Execute inconsistently
2. Frequently make excuses for why they didn't deliver
3. Their quality is not consistently of a high degree - sporadically produce exceptional results
4. Have difficulty discerning "A" vs. "B" talent - and will sometimes end up hiring "C" players
5. Have difficulty mentoring others - or lack the interest/initiative/drive to mentor others
6. In the absence of any "A" players - they can actively impede the growth of an organization
7. They nibble at logjams
8. Minimal investment in personal growth, very low level of curiosity, skills atrophy over time.
9. Expend the minimum effort in collaboration, communication, documenting, and sharing
10. Have trouble identifying what matters, where to focus, what will move the needle, and what is essential.

"C" players
1. Consistently fail to execute
2. Consistently fail to deliver
3. Their quality is consistently at a sub-optimal level
4. Do not realize their incompetence (re: Dunning–Kruger)
5. Actively impede attempts to mentor/improve a team
6. Are only able to hire "C" and "D" players ("A" and "B" players will decline job offers from a "C" player)
7. They create logjams
8. No curiosity, have no interest in investing in personal/professional growth, skills are consistently insufficient for their role.
9. Consistently demonstrate zero effort in collaboration, communication, documenting, sharing.
10. Excel at focusing on things that do not matter, that create the appearance of work - but do not actually produce value, and have no clue what is essential.



Suggested Reading

2005 

2024

"As I type this I realize it may not be the wisest to categorize everyone into 3 buckets but this is how I believe we should look at everyone a part of the production team. You’re either an A-Player, B-Player, or C-Player. There is only room in this company for A-Players. A-Players are obsessive, learn from mistakes, coachable, intelligent, don’t make excuses, believe in Youtube, see the value of this company, and are the best in the goddamn world at their job. B-Players are new people that need to be trained into A-Players, and C-Players are just average employees. They don’t suck but they arn’t exceptional at what they do. They just exist, do whatever, and get a paycheck. They arn’t obsessive and learning. C-Players are  poisonous and should be transitioned to a different company IMMEDIATELY."

 

 

2023-10-06

2023-10-06 Friday - Today's meditation: On the value of "Wall Walks"

[Image by meineresterampe from Pixabay.com]


Today's meditation: On the value of "Wall Walks"

A Wall Walk - is a technique for breaking siloed thinking, for encouraging innovation, for identifying dependencies & risks, and for encouraging open communication & collaboration.

It is a periodic meeting (quarterly usually feels like a good cadence - however, during periods of rapid change - monthly may be appropriate), that pulls together participants from all of the disciplines across a company - and each area is given [n] minutes to give a brief talk, with a question & answer session following.

What makes the Wall Walk *fundamentally* different from almost every other presentation you will see in any company - is that it isn't intended as an opportunity for the team to proclaim their glorious achievements - or show how many areas they are reporting as GREEN to management (when in reality, we all know, some of them are actually RED).

The goal for a Wall Walk talk should be to cover:
- What we recently delivered
- What we are working on - and how it may impact the rest of you
- Experiments we've tried - what worked - and what didn't
- *Challenges* we are struggling with - would love to have follow-ups to hear your ideas
- Future planned work - in areas in which we know (or believe) that there will be dependencies that impact you.

To implement Wall Walks requires courage - and a willingness to tell the unvarnished truth.


Other variations on the concept of Wall Walks:

2023-10-02

2023-10-02 Monday - Research Notes: Hoshin Kanri ("Compass Management") and X-Matrix in strategic planning

 

[image credit: Clker-Free-Vector-Images on Pixabay]


  
https://en.wikipedia.org/wiki/Hoshin_Kanri

  • ...a 7-step process used in strategic planning in which strategic goals are communicated throughout the company and then put into action.
  • The Hoshin Kanri strategic planning system originated from post-war Japan, but has since spread to the U.S. and around the world. Translated from Japanese, Hoshin Kanri aptly means "compass management". The individual words "hoshin" and "kanri" mean direction and administration, respectively.
  • Hoshin Kanri requires a strategic vision in order to succeed. 
  • From there, strategic objectives need to be clearly defined, with goals being written for long periods of a one to five-year-long timeframe
  • Once the long term timeframe goals are completed, the team can focus on yearly objectives
  • Management needs to avoid picking too many vital goals in order to stay focused on what is strategically important
  • Hoshin Kanri is a top-down approach, with the goals being mandated by management and the implementation being performed by employees.
  • Companies that use Hoshin Kanri often follow a Think, Plan, Implement, and Review process, which is comparable to W. Edwards Deming's Plan Do Check Act cycle

    
If you are pressed for time, read this:

The Ultimate Guide to Strategy Deployment using Hoshin Kanri (X-Matrix)
https://www.linkedin.com/pulse/ultimate-guide-strategy-deployment-using-hoshin-kanri-vetriko/
(See triangle in diagram, Principles of Hoshin Kanri - nice graphics)

 

2023-09-30

2023-09-30 Saturday - On the distinctions between Functional and Non-Functional Requirements

 

[Image by Alexa from Pixabay]


This blog post grew out of a LinkedIn post by Dave Farley linking to his recent YouTube video, "Non-Functional Requirements" Are STUPID, and some of our subsequent LinkedIn exchanges (1, 2, 3 - as well as 4 and 5 - which I've incorporated into the initial content for this blog post).

See link to my initial comment on his post, #1:

Possible examples of counter-arguments:

1. Some NFRs are cross-cutting - and should be consistently applied enterprise-wide, across all domains, all applications.
2. Repeating the definitions, in vertical contexts - violates the DRY principle.
3. Enterprise-level NFRs provide a consistent reference - that can be reused across initiatives, products, programs, projects, applications, etc.
4. Categorizing everything as just a "requirement" (with no distinction between technical/NFR vs. functional) increases the complexity and cognitive load when trying to get a business stakeholder to focus on reviewing/approving just the business requirements.

Example:
NFR: "All NPI, PII, PHI, PCI information must be encrypted at-rest, and in-transit, with encryption standards specified in INFOSEC-STANDARD-001."
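
A minimal sketch of counter-arguments 1-4, in Python: the NFR above is defined once in an enterprise-level catalog and referenced by ID from each project, rather than being restated per project. The catalog ID ("NFR-SEC-001"), the class names, and the two sample projects are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NonFunctionalRequirement:
    """An enterprise-level NFR, defined once and referenced by ID."""
    nfr_id: str
    statement: str


# Hypothetical enterprise catalog; the ID "NFR-SEC-001" is illustrative only.
ENTERPRISE_NFRS = {
    "NFR-SEC-001": NonFunctionalRequirement(
        nfr_id="NFR-SEC-001",
        statement=(
            "All NPI, PII, PHI, PCI information must be encrypted at-rest, "
            "and in-transit, with encryption standards specified in "
            "INFOSEC-STANDARD-001."
        ),
    ),
}


@dataclass
class ProjectRequirements:
    """A project's functional requirements plus references to shared NFRs."""
    name: str
    functional: list[str] = field(default_factory=list)
    nfr_refs: list[str] = field(default_factory=list)

    def business_view(self) -> list[str]:
        """Only the functional requirements, for business stakeholder review."""
        return list(self.functional)

    def resolved_nfrs(self) -> list[NonFunctionalRequirement]:
        """Look up each referenced NFR in the shared catalog."""
        return [ENTERPRISE_NFRS[ref] for ref in self.nfr_refs]


# Two hypothetical projects that reuse the same cross-cutting NFR.
payments = ProjectRequirements(
    name="Payments",
    functional=["A customer can schedule a recurring payment."],
    nfr_refs=["NFR-SEC-001"],
)
claims = ProjectRequirements(
    name="Claims",
    functional=["An adjuster can approve or deny a claim."],
    nfr_refs=["NFR-SEC-001"],
)
```

Both projects resolve to the same catalog entry, so a change to the encryption standard is made in one place, and `business_view()` gives a stakeholder only the functional requirements to review.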


See link to my follow-up reply, #4:

Dave Farley I think revisiting some definitions might be helpful for others in considering this discussion.

I will stipulate that if someone reading this (besides Dave) doesn't value architecture, or understand the decomposition and importance of layers in architecture - they may struggle with understanding the intended purpose for making a distinction between functional and non-functional types of requirements.

If you are only focused on what is in the next sprint - and never do any requirements analysis or design - this discussion will be moot for you.

Oxford English Dictionary (OED):

Functional:

  1. of or having a special activity, purpose, or task; relating to the way in which something works or operates.

  2. designed to be practical and useful, rather than attractive.

  3. working or operating.



Also see ISO 25010
https://iso25000.com/index.php/en/iso-25000-standards/iso-25010

 

https://en.wikipedia.org/wiki/Functional_requirement

(selected citations)

  • In software engineering and systems engineering, a functional requirement defines a function of a system or its component, where a function is described as a summary (or specification or statement) of behavior between inputs and outputs.

  • Functional requirements may involve calculations, technical details, data manipulation and processing, and other specific functionality that define what a system is supposed to accomplish

  • Functional requirements are supported by non-functional requirements (also known as ‘quality requirements’), which impose constraints on the design or implementation (such as performance requirements, security, or reliability). Generally, functional requirements are expressed in the form ‘system must do <requirement>,’ while non-functional requirements take the form ‘system shall be <requirement>.’

  • The plan for implementing functional requirements is detailed in the system design, whereas non-functional requirements are detailed in the system architecture

  • As defined in requirements engineering, functional requirements specify particular results of a system.

  • ...contrasted with non-functional requirements, which specify overall characteristics such as cost and reliability.

  • Functional requirements drive the application architecture of a system, while non-functional requirements drive the technical architecture of a system


https://www.altexsoft.com/blog/business/functional-and-non-functional-requirements-specification-and-types/

  • Functional requirements are product features or functions that developers must implement to enable users to accomplish their tasks

  • Nonfunctional requirements, not related to the system functionality, rather define how the system should perform.

     

     

2023-09-10

2023-09-10 Sunday - Book Review: Azure Data and AI Architect Handbook

[image source: Amazon.com]

Book Title: 

Azure Data and AI Architect Handbook: Adopt a structured approach to designing data and AI solutions at scale on Microsoft Azure 

Book Details:
Pages: 284
Publication Date: 2023-07-31 (August 2023, in the book)
 

Author(s):

[Link to my LinkedIn post referencing this review]

My Review Summary:

Mostly, an excessively long, Microsoft marketing brochure
 

My Review Commentary:

As I read through the chapters of this book, the thought that kept coming to my mind:
 
"It's like reading diluted & neutered sets of Microsoft Azure documentation" (i.e., no rich cross-linking to additional relevant content - and almost no hands-on examples)
 
Read on, for why I had that feeling...


At 284 pages (but only 245, if we exclude the Index)  – this book impressively attempts to cover a wide range of information that will be of interest to anyone that wishes to establish an architect-level awareness of Azure data and AI architecture capabilities.

Note: For my review – I read a PDF version of the book that I downloaded from Packt’s web site, AFTER the publication date of the book.

Three key criticisms I have - with almost the entire book: 

  • A significant lack of additional suggested reading links (beyond just the paltry few citations of Microsoft Azure documentation). There is a severe dearth of reference to other related material, articles, books, research papers - that would deeply enrich the reader's experience - and magnify the educational value of this book.
  • With the noticeable exception of Chapter-8, there is a severe paucity of actual detailed examples in the majority of the book's pages. 
  • The lack of a companion GitHub repository - providing hands-on examples.


This book suffers from a lack, in almost all chapters, of any  in-depth, detailed discussion – of real-world examples & case studies. In Chapter-3 (Page-39), fraud detection is briefly mentioned – and would have made an EXCELLENT example / case study – on which to elaborate in that chapter.

In almost every instance  – the reader would be better served by simply reading the Microsoft Azure documentation – rather than the diluted treatment given to many topics in the various chapters – most of which lack the basic courtesy of pointing the reader to the appropriate online documentation landing page, for the services discussed.

What I liked

Chapter-3’s discussion of Kappa and Delta Lake architectures.

Chapter-6’s coverage of Data Warehousing (this is the best-written chapter in the entire book, and provides detailed examples to clearly explain concepts).

What could be improved in the next edition:

Better use of color – and consistent use of color - in diagrams.

Page-xvi, hyperlink to errata page is not enabled.

MAJOR MISS: Inclusion of a companion GitHub project for the book, to provide some hands-on exercises.

Chapter-1 (page-4): The first sentence of this book, published in July/August 2023 - refers to some growth predictions, in the past..."Data generation is growing at an exponential rate. 90 percent of data in the world was generated in the last 2 years, and global data creation is expected to reach 181 zettabytes in 2022".  A better quote would be to show the expected growth by 2030, at the very least.

Chapter-1 (Page-7): The Data Architecture reference diagram does not reflect a “Data orchestration and processing” layer – but this is called out in the bullet list enumeration of diagram elements.

Chapter-1 (Page-8): Appears to still have some internal / editor reminder note embedded in the text, re: “(Add what data ingestion services will be discussed later in the book).”

Chapter-1 (Page-9): Appears to still have some internal / editor reminder note embedded in the text, re:“(Add what data storage services will be discussed later in the book).”

Chapter-1 lacks any suggested links, additional reading – to enrich the reader’s experience.  
NOTE: This criticism holds TRUE for the MAJORITY of the book's chapters.

Chapter-1 is missing a section to introduce the fundamental concepts of Data Architecture Principles

Chapter-1 would benefit from having a table to provide a comparison of the capabilities across the major Cloud Service Providers (CSPs) – i.e., Azure, AWS, GCP.

Microsoft’s choice of the acronym WAF (for Well-Architected Framework) – is unfortunate – as it could easily be confused with the more common usage (Web Application Firewall).  For example, on page-18, there is an [incorrect] link to (“Azure Well-Architected Framework review - Azure Application Gateway v2” documentation) – that clearly refers to ”WAF” in the context of  a Web Application Firewall (“Be aware of Application Gateway capacity changes when enabling WAF”)

Chapter-2 (Page-18) – The hyperlink to Microsoft Azure WAF documentation page is incorrect, and not enabled.

Chapter-2 (Page-18) – There is supposed to be a link to refer the reader to the Well-Architected Framework (WAF) main page (re: “For the complete framework…”)  – but the link that is provided – is to a sub-page– referring to Application Gateway concerns – “Azure Well-Architected Framework review - Azure Application Gateway v2”.

Chapter-2 (Page-23) - The section on cost optimization discussion – would be better placed near the end of the book, in a dedicated chapter for that topic.

Chapter-2 (page-23) - The advice to “Whenever possible, look for cloud-native offerings to offload your workloads.” – seems incongruent with the section’s focus on cost optimization. If you don’t have significant variability in your scalability requirements – and you have sufficient compute power in an existing data center – you may be able to more efficiently manage some CPU/memory intensive workloads – on your existing data center hardware.

Chapter-2 would greatly benefit from some illustrative worked examples showing how costs vary with different deployment choices for a few simple Data Architecture examples, instead of merely saying that costs can vary across regions, that network ingress/egress can increase costs, or that hosting in different regions can increase latencies. In particular, it should cite some actual examples from the barely mentioned Azure calculator and Total Cost of Ownership (TCO) calculator.
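As a rough illustration of the kind of worked example meant here, the sketch below compares two deployment choices for the same data set. Every number in it is a made-up placeholder (not an Azure price, and not taken from the book or from either calculator), and the `Deployment` class is hypothetical; real figures would come from the Azure pricing and TCO calculators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """One deployment choice; all prices are hypothetical placeholders."""
    name: str
    storage_gb: float
    storage_price_per_gb_month: float  # placeholder USD per GB-month
    egress_gb: float                   # data leaving the region each month
    egress_price_per_gb: float         # placeholder USD per GB

    def monthly_cost(self) -> float:
        # Storage cost plus network egress cost for one month.
        storage = self.storage_gb * self.storage_price_per_gb_month
        egress = self.egress_gb * self.egress_price_per_gb
        return storage + egress


# Same data volume; the second option also moves 500 GB per month across regions.
single_region = Deployment("single-region", 1000, 0.02, 0, 0.08)
cross_region = Deployment("cross-region", 1000, 0.02, 500, 0.08)

for option in (single_region, cross_region):
    print(f"{option.name}: ${option.monthly_cost():.2f} per month")
```

Even with placeholder prices, a side-by-side like this makes the effect of egress visible in a way that a sentence such as "egress can increase costs" does not.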

Chapter-2 (page-27) - The very brief discussion of “Using data partitioning” – would be much better if it included a discussion of the why, for each strategy mentioned.
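To show what a "why" for each strategy could look like, here is a minimal sketch of two common horizontal partitioning approaches. These are generic examples and may not match the strategies listed in the book; the function names are hypothetical.

```python
import hashlib
from datetime import date


def hash_partition(key: str, partition_count: int) -> int:
    """Hash partitioning.

    Why: keys are spread roughly evenly across partitions, which helps
    avoid hot partitions under write-heavy load.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def range_partition(event_date: date) -> str:
    """Range partitioning by calendar month.

    Why: rows for one month stay together, so date-bounded queries touch
    fewer partitions and old months can be archived as a unit.
    """
    return f"{event_date.year:04d}-{event_date.month:02d}"


print(hash_partition("customer-42", 8))
print(range_partition(date(2023, 12, 31)))
```

The trade-off is the point: hash partitioning favors even load, while range partitioning favors range scans and lifecycle management.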

Chapter-2 (page-29) – The enumeration of the concepts of Subscriptions, Resource groups, and Management groups – is not in the same order as the hierarchy depicted in the corresponding diagram – which introduces confusion – and needless burden on the reader to mentally CORRECT what they may have thought was safe to infer from the ordering of the list. Rule #1: Make learning EASY for the reader

Chapter-2 (page-29) - The book still refers to the old name ("Azure Active Directory (AAD)"). It should be updated to reflect the new name ("Microsoft Entra ID") that was announced on July 11th, 2023, BEFORE the book was published.

Chapter-2 (page-30) – “The architecture of the data management landing zone is quite extensive and may be hard to clearly visualize in this book” – supports my belief that this book should actually be closer to 450-650 pages in length.

Chapter-2 (page-30) the link to the data management landing zone is not hyperlink enabled – and when the text is copied – it mangles the link, putting parts of the URL out of their correct order.

Chapter-2 (page-31): "Services shown in color are mandatory for the landing zone, whereas services that appear in gray are optional" re: Fig 2.2. This is *very* confusing, as there don't appear to be any services colored gray. The only things in gray are the layers. The services appear only in either black or reddish-orange.

Chapter-3 discusses different strategies for ingestion, but the decision criteria are often embedded in paragraphs. A decision tree or decision-criteria table would help communicate the information more visually. This would be especially helpful when more than two possible choices are discussed.
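The idea can be sketched as a small decision table. The two criteria and the three outcomes below are assumptions chosen for illustration only; they are not the book's guidance, and a real table would carry more criteria (volume, frequency, network constraints, and so on). The names `INGESTION_CHOICES` and `suggest_ingestion` are hypothetical.

```python
# Hypothetical decision table: (latency need, source location) -> suggestion.
INGESTION_CHOICES = {
    ("streaming", "cloud"): "event streaming service (e.g., Azure Event Hubs)",
    ("streaming", "on-premises"): "event streaming service (e.g., Azure Event Hubs)",
    ("batch", "on-premises"): "batch pipeline via a self-hosted integration runtime",
    ("batch", "cloud"): "batch pipeline via a cloud-hosted integration runtime",
}


def suggest_ingestion(latency: str, source_location: str) -> str:
    """Look up a suggestion; raise ValueError for combinations not in the table."""
    try:
        return INGESTION_CHOICES[(latency, source_location)]
    except KeyError:
        raise ValueError(
            f"no rule for latency={latency!r}, source_location={source_location!r}"
        ) from None


print(suggest_ingestion("batch", "on-premises"))
```

Laying the criteria out as explicit keys makes gaps obvious: any combination the text never addresses shows up as a missing row.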

Chapter-3 (page-51): The term SHIRs is introduced, and is defined as self-hosted IRs. However, nowhere in the previous pages, was IR defined as an acronym.  For the benefit of the reader, the full term should be defined here as Self-Hosted Integration Runtime.

Chapter-3 (page-57): The discussion on Event Hubs should include a link to the “Azure Event Hubs quotas and limits” page in the Azure documentation.

Chapter-6 (page-135): The reference to “The data vault method” – should provide the proper attribution to its creator: "The author of the third approach to the subject of the data warehouse, known as the Data Vault, is Dan Linstedt. The Data Vault is the result of 10 years of his research efforts to ensure the consistency, flexibility and scalability of the warehouse. The first results of his research in this field are five articles on this subject, which were published in 2000. In contrary to Inmon’s view, Linstedt assumes that all available data from the entire time period should be loaded into the warehouse. This is known as the “single version of the facts” approach. As with Kimball’s star schema, with the Data Vault Linstedt introduces some additional objects to organize the data warehouse structure. These objects are referred to as the hub, satellite and link". [source]
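For readers unfamiliar with the hub, satellite, and link objects named in that quote, a minimal sketch of how they relate follows. The field names, the SHA-256 hash key, and the sample values are illustrative assumptions, not definitions taken from the book or from Linstedt's specification.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def hash_key(*business_keys: str) -> str:
    """Derive a deterministic key from one or more normalized business keys."""
    normalized = "|".join(key.strip().upper() for key in business_keys)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Hub:
    """A core business entity, identified only by its business key."""
    hub_key: str
    business_key: str
    record_source: str
    load_ts: datetime


@dataclass(frozen=True)
class Link:
    """A relationship between two or more hubs."""
    link_key: str
    hub_keys: tuple[str, ...]
    record_source: str
    load_ts: datetime


@dataclass(frozen=True)
class Satellite:
    """Descriptive attributes for a hub, versioned by load timestamp."""
    hub_key: str
    attributes: dict[str, str]
    record_source: str
    load_ts: datetime


now = datetime.now(timezone.utc)
customer = Hub(hash_key("C-1001"), "C-1001", "crm", now)
order = Hub(hash_key("O-77"), "O-77", "orders", now)
placed = Link(hash_key("C-1001", "O-77"), (customer.hub_key, order.hub_key), "orders", now)
customer_details = Satellite(customer.hub_key, {"name": "Example Customer"}, "crm", now)
```

Keeping identity (hub), relationships (link), and descriptive history (satellite) in separate structures is what lets new sources be added without reshaping existing tables.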

Chapter-7 (page-144): "Figure 7.6 – Power BI Premium as a superset of AAS", the light-colored font is *much* more difficult to read.

Chapter-7 should introduce the concepts of taxonomy and ontology – and provide reference to some public domain examples.

For example:


Chapter-8 (page-154): The link to the pricing for Power BI is __very__ incongruent with the *complete* lack of reference to any links for other service pricing details – as well as the lack of any citation in the book to the __very important__ documentation links for service-specific Quotas and Limits

Chapter-8 itself – feels like it is VERY out-of-place, and does not feel like it belongs in an ARCHITECT book. It is written to a level of detail for a DEVELOPER, that I WISH the *PREVIOUS* 7 chapters had demonstrated.

Chapter-8 raises the question: why does it delve into development details when none of the previous chapters have touched on such matters?

Chapter-9 (pages 185-187): Discusses Azure Cognitive Services (re: Speech, Vision) – but doesn’t connect the dots to how this applies to Data Architecture.  Further, the level of discussion barely goes beyond “brochure-ware” – and smells of a marketing ploy – not a chapter intent on teaching how to use the Azure AI services.

Chapter-9 (189-…): Begins discussing the “Azure OpenAI Service”, and though it makes a vague reference to *some* hallucination concerns, it DOES NOT cite the relevant OpenAI papers: the GPT-4 Technical Report (27 March 2023), nor the GPT-4 System Card (27 March 2023), the latter of which specifically includes this explicit warning: “In particular, our usage policies prohibit the use of our models and products in the contexts of high risk government decision making (e.g, law enforcement, criminal justice, migration and asylum), or for offering legal or health advice.”

Chapter-10: Does not provide any links to the relevant standards that are cited (i.e., DCAM, DAMA DMBOK)

Chapter-11 (page-228): states “The only significant choice to make here is which version of the TLS protocol to choose: TLS 1.0, TLS 1.1, or TLS 1.2”. This ignores the fact that TLS 1.0 and TLS 1.1 have been deemed vulnerable, and that TLS 1.2 should be enforced as the minimum version. Further, this sentence should include TLS 1.3. The appropriate NIST guidance for TLS (NIST SP 800-52 Rev. 2) should be cited for the exclusion of TLS 1.0 and TLS 1.1, and for the recommended adoption of TLS 1.2 and TLS 1.3.
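To make the point concrete, here is a minimal client-side sketch of enforcing TLS 1.2 as the floor, using Python's standard `ssl` module. It is a generic illustration, not the Azure service setting the book discusses, and `make_client_context` is a hypothetical helper name.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client TLS context that refuses TLS 1.0 and TLS 1.1."""
    context = ssl.create_default_context()
    # TLS 1.2 is the minimum accepted version; TLS 1.3 is still negotiated
    # when both the local OpenSSL build and the server support it.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_context()
print(context.minimum_version.name)
```

Setting only the minimum (and leaving the maximum alone) avoids accidentally blocking TLS 1.3.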

Book's companion GitHub repository:

N/A - completely missing

*My* Additional Suggested Microsoft Documentation References:

  1. https://learn.microsoft.com/en-us/azure/architecture/data-guide/
  2. https://learn.microsoft.com/en-us/azure/architecture/data-guide/big-data/
  3. https://learn.microsoft.com/en-us/azure/architecture/example-scenario/data/data-warehouse/
  4. https://learn.microsoft.com/en-us/azure/architecture/solution-ideas/articles/enterprise-data-warehouse/
  5. https://learn.microsoft.com/en-us/azure/architecture/solution-ideas/articles/advanced-analytics-on-big-data/
  6. https://learn.microsoft.com/en-us/azure/architecture/reference-architectures/data/enterprise-bi-adf/
  7. https://learn.microsoft.com/en-us/azure/architecture/example-scenario/data/small-medium-data-warehouse/
  8. https://learn.microsoft.com/en-us/azure/architecture/example-scenario/analytics/enterprise-bi-synapse/
  9. https://learn.microsoft.com/en-us/azure/architecture/example-scenario/dataplate2e/data-platform-end-to-end/ 
  10. https://learn.microsoft.com/en-us/azure/storage/common/storage-service-encryption 
    1. “Data in Azure Storage is encrypted and decrypted transparently using 256-bit AES encryption, one of the strongest block ciphers available, and is FIPS 140-2 compliant.”
    2. https://learn.microsoft.com/en-us/windows/win32/seccng/cng-portal
      1. "Cryptography API: Next Generation (CNG) is the long-term replacement for the CryptoAPI. CNG is designed to be extensible at many levels and cryptography agnostic in behavior."
    3. https://en.wikipedia.org/wiki/Advanced_Encryption_Standard 
      1. At present, there is no known practical attack that would allow someone without knowledge of the key to read data encrypted by AES when correctly implemented.

*My* Additional Suggested Background Reading:

  1. Building a Scalable Data Warehouse with Data Vault 2.0  (2015, by Dan Linstedt, and Michael Olschimke)
  2. https://www.snowflake.com/resource/5-best-practices-for-data-warehouse-development/
  3. https://www.astera.com/type/blog/data-warehouse-concepts/
  4. https://www.geeksforgeeks.org/data-warehouse-architecture/ 
  5. https://www.geeksforgeeks.org/difference-between-kimball-and-inmon/ 
  6. https://medium.com/cloudzone/inmon-vs-kimball-the-great-data-warehousing-debate-78c57f0b5e0e
  7. https://www.incorta.com/blog/death-of-a-star-schema-redux-moving-beyond-inmon-and-kimball
    1. Historically, there were two models to choose from: Ralph Kimball’s “bottom-up” approach to mapping atomic data or Bill Inmon’s “top-down” model. In recent years, however, the technology that supports BI and data warehousing has evolved rapidly. Now, there is a third option for data warehousing and BI in a post-star-schema, post-ETL world: non-dimensional data modeling.
    2. https://go.incorta.com/recording-death-of-the-star-schema
  8. https://www.nearshore-it.eu/articles/technologies/data-warehouse-architecture/
    1. Data warehouses are inextricably associated with the American computer scientist Bill Inmon, born in 1945, who is widely considered the father of the data warehouse. In 2007, Bill Inmon was named by Computerworld as one of the ten people who have had the most significant impact on IT development in the past 40 years. In 1992, Inmon defined the data warehouse as follows:
      1. "A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision-making process."
    2. Next to Inmon, Ralph Kimball, born in 1944, is another key figure in the field of data warehousing. Unlike Inmon’s definition of a data warehouse, where the emphasis is on the characteristics of the warehouse, Kimball focuses on its purpose: “a copy of transaction data specifically structured for query and analysis.”
    3. The author of the third approach to the subject of the data warehouse, known as the Data Vault, is Dan Linstedt. The Data Vault is the result of 10 years of his research efforts to ensure the consistency, flexibility and scalability of the warehouse. The first results of his research in this field are five articles on this subject, which were published in 2000. 
      1. "In contrary to Inmon’s view, Linstedt assumes that all available data from the entire time period should be loaded into the warehouse. This is known as the “single version of the facts” approach. As with Kimball’s star schema, with the Data Vault Linstedt introduces some additional objects to organize the data warehouse structure. These objects are referred to as the hub, satellite and link."
  9. https://www.analytics8.com/blog/is-dimensional-data-modeling-still-relevant-in-the-modern-data-stack/
    1. Is dimensional data modeling still relevant in the modern data stack? 
      1. Yes—specifically for defining requirements and creating a modular solution presenting data for analytics.
  10. In 2017, Gartner estimated that 60% of data warehouse implementations would have only limited acceptance or fail entirely.
    1. https://www.gartner.com/en/newsroom/press-releases/2015-09-15-gartner-says-business-intelligence-and-analytics-leaders-must-focus-on-mindsets-and-culture-to-kick-start-advanced-analytics 
  11. YouTube: Kimball in the context of the modern data warehouse: what's worth keeping, and what's not 
    1. https://www.youtube.com/watch?v=3OcS2TMXELU 
  12. Innovative Approaches for efficiently Warehousing Complex Data from the Web
    1. https://arxiv.org/abs/1701.08643
  13. Toward a New Approach for Modeling Dependability of Data Warehouse System
    1. https://arxiv.org/abs/1311.1181 
  14. The End of an Architectural Era for Analytical Databases
    1. https://arxiv.org/abs/1209.1425
  15. An Approach to Handle Big Data Warehouse Evolution (2018) 
    1. https://arxiv.org/abs/1809.04284 
  16. On building Information Warehouses
    1. https://arxiv.org/pdf/0910.2638.pdf 
  17. A new paradigm for accelerating clinical data science at Stanford Medicine
    1. https://arxiv.org/abs/2003.10534 
    2. Abstract: "Stanford Medicine is building a new data platform for our academic research community to do better clinical data science. Hospitals have a large amount of patient data and researchers have demonstrated the ability to reuse that data and AI approaches to derive novel insights, support patient care, and improve care quality. However, the traditional data warehouse and Honest Broker approaches that are in current use, are not scalable. We are establishing a new secure Big Data platform that aims to reduce time to access and analyze data. In this platform, data is anonymized to preserve patient data privacy and made available preparatory to Institutional Review Board (IRB) submission. Furthermore, the data is standardized such that analysis done at Stanford can be replicated elsewhere using the same analytical code and clinical concepts. Finally, the analytics data warehouse integrates with a secure data science computational facility to support large scale data analytics. The ecosystem is designed to bring the modern data science community to highly sensitive clinical data in a secure and collaborative big data analytics environment with a goal to enable bigger, better and faster science."

From the Amazon Listing:

"With data’s growing importance in businesses, the need for cloud data and AI architects has never been higher. The Azure Data and AI Architect Handbook is designed to assist any data professional or academic looking to advance their cloud data platform designing skills. This book will help you understand all the individual components of an end-to-end data architecture and how to piece them together into a scalable and robust solution."

"You’ll begin by getting to grips with core data architecture design concepts and Azure Data & AI services, before exploring cloud landing zones and best practices for building up an enterprise-scale data platform from scratch. Next, you’ll take a deep dive into various data domains such as data engineering, business intelligence, data science, and data governance. As you advance, you’ll cover topics ranging from learning different methods of ingesting data into the cloud to designing the right data warehousing solution, managing large-scale data transformations, extracting valuable insights, and learning how to leverage cloud computing to drive advanced analytical workloads. Finally, you’ll discover how to add data governance, compliance, and security to solutions."

"By the end of this book, you’ll have gained the expertise needed to become a well-rounded Azure Data & AI architect."

What you will learn

  • "Design scalable and cost-effective cloud data platforms on Microsoft Azure"
  • "Explore architectural design patterns with various use cases"
  • "Determine the right data stores and data warehouse solutions"
  • "Discover best practices for data orchestration and transformation"
  • "Help end users to visualize data using interactive dashboarding"
  • "Leverage OpenAI and custom ML models for advanced analytics"
  • "Manage security, compliance, and governance for the data estate"

Who this book is for

"This book is for anyone looking to elevate their skill set to the level of an architect. Data engineers, data scientists, business intelligence developers, and database administrators who want to learn how to design end-to-end data solutions and get a bird’s-eye view of the entire data platform will find this book useful. Although not required, basic knowledge of databases and data engineering workloads is recommended."

 

 

Copyright

© 2001-2021 International Technology Ventures, Inc., All Rights Reserved.