Sunday, November 23, 2008

2008-11-21 Friday - QCon 2008 Morning

Disclaimer: There may be some many mistakes/errors in my blog notes for QCon over the next few days - as I am writing these posts while I'm sitting in sessions - and much of these notes are stream-of-consciousness - as I try to keep up with the presenters.

I'm in San Francisco attending QCon 2008 this week.

Social Architectures

MySpace presentation
Dan Farino, Chief Systems Architect,

__The traditional distribution channels are being redefined...__
SMB Channels are out pacing large channels

Transacting within the community being redefined

Small producers can now compete with the large industries

__Social Fuzziness__
- the boundaries fo the social sites aree not as clear sa the preceeding slides suggest
-- myspace and facebook transactd
-- ebay has a thriving community
-- digg and ing are about networking

__Social Architecture Challenges__
- worry not that no one knows of you - seek to be worth knowing -- confucious

Shared joy is a double joy - shared sorrow is half a sorrow - swedish proverb

connections between people are transitive and lack affinity

partitioning for scale is always problematic

Tools are sparse for large-scale windows server management

Traditional Plan:
- Plan
- Implement
- Test
- Go Live
- Manage growth

MySpace plan as executed
- implement
- go live

- reboot servers - often
- "Shotgun debbugging"
-- a process of making relatively undirected changes to software in the hope that a bug will be perturbed out of existence.
-- need to resolve the problem now and collecting data for analysis would take too long

Windows 2000 Server

Operationally - where they were...
- batch files and robocopy for code deployment
- "psecec" for remote admin script execution
- Windows Performance Monitor for monitoring

No formal, automaoted QA process

Current Architecture
- 4,500 web serves, windows 2003 IIS 6.0 ASP.NET
- 1,200 "cache" servers - 64-bit windows, 2003 (key value pair, distributed)
- 500+ database servers

QA Today
- unit tests/automated testing
- don't fuzz the site nearly as thoroughly as the users do
- there are still problems that happen only in production

Ops Data Collection
Two types of systems:
- Static: collect, store, and alert based on pre-configured rules (e.g. Nagios)
- Dynamic: Write an ad-hoc script or application to collect data for an immediate or one-off need

(Very interesting: Windows Performance counter monitor display in their Ops Data Collection)

Cons of static system:
- relatively central configuration managed by a small number of administrators
- bad for one-off requests: change he config, apply, wait for data
- developer's questions usually go unanswered

Devlopers like to see their creations come to life

Cons of the dynamic system:
- it's not really a "system" at all - its an administrator running a script
- is a privileged operation: scripts are powerful and can be potentially make changes to the system
- even run as a limited userr, bad scripts can still DoS the system
- one-shot data collection is possible but learning about deltas takes a lot more code (and polling, yuck)
- different custom-data collection tools that request the same data point caused duplicated network traffic

*** They use Powershell a lot ***

Ideally, all operational data avaialble in the entire server farm should be queryable

New Operational Data-subscription platform for Ops Data Collection
- on-demand
- supports both "one-shot" and "persistent" modes
- can be used by non-privileged users
- a client makes __one__ TCP connection to a "Collector" server
-- can receive data related to thousands of servers via this one connection
-- like having all of the servers in a Jabber chat room and being able to talk to a selected subset of them at any time (over __one__ connection)

Agents provide...
- Windows Performance Counters
- WMI Objects
-- event logs
-- hardware data
-- custom WMI objects published from out-of-process
- Log file contents

On Linux, plans are to hook into something like D-Bus...

All C#, asynch I/O - never blocks a thread
Uses MS "Concurrency and Coordination Runtime"
Agent runs on each host
Wire protocol is Google's Protocol Buffers
Clients and Agents can be easily writtten to the Agents wanted to see if C#+CCR could handle the load (yes it can)

Why develop something new?
- there doesn't seem to be anything out there right now that fits the need
- requirements include free and open source

To do it properly - you really need to be using 100% async I/O
Libraries that make this easy are relatively new
- Twisted
- GTask ==> Need to research this (for Linux asynch callback processing)
- Erlang

What does this enable?
- the individual interested in the data can gather it theirself.
- its almost like exploring a database with some ad-hoc SQL queries
- "I wonder"...questions are easily answered without a lot of work
- charting/alerting/data-archiving systems no longer concern themselves with the data-collection intricacies
- we can spend time writing the valuable code instead of rewriting the same plumbing every time
- abstract physical server-farm from teh user
- if you know machine names, great - but you can also say "all servers serving..."
- guaranteed to keep you up to date
- get your initial set of data and then just wait for the deltas
- pushes polling as close to the source as possible
- eliminates duplicate requests
- hundreds of clients can be monitoring the "% processor time"
- only collects data that someone is currently asking for

Is this a good way to do it?
- having too much data pushed at you is a bad thing
- being able to pull from a large selection of data points is a good thing

For developers, knowing they will have access to instrumentation data even in production encourages more detailed instrumentation.

Easy and fun API's

Using LINQ: Language Integrated Query
LINQ via C# and CLINQ ("Continuous LINQ") = instant monitoring app (in about 10 lines of code)

var counters = ...
MainWpdfWindows.MainGrid = counters;
// go grab a beer

Tail a file across thousands of servers
- with filtering expression being run on the remote machines
- at the same time as someone else is (with no duplicate lines being sent over the network)
- multicase only to the people that are subscribed (?)

Open Source it?
- hopefully

Other implementations?
- may write a GTask / Erlang implementation

2008-11-21 Friday


Digg: An Infrastructure in Transition
Joe Stump, Lead Architect, Digg

35,000,000 unique
3,500,000 users
15,000 requests/sec
hundres of servers

"Web 2.0 sucks (for scaling)" - Joe Stump

What's Scaling?
- specialization
- severe hair loss
- isn't something you do by yourself

What's Performance?
- who cares?

Not necessarily concerned with how quick - but can they store everything they need - and return it in a reasonable amount of time.

Clusters of databases are designated as WRITE - others are READ

Server Tiers
- Netscalers
- Applications
- Lucene (for search)
- Recommendation Engine (graph database)
- MogileFS - distributed web-dav store
- Database servers clusters - which serve different parts of the site (LULX, AFK, ZOMG, WTF, ROFL)

A normal relationship database doesn't work well for certains types of views into your data

MySQL has problems under high write-load

Messaging framework
- XMPP? - stateless - can't go back in time
- Conveyor (allows rewind of data?)

- elastic horizontal partitions
- heterogenous partition types
- multi-homed
- IDs live in multiple places
- partitioned result sets

- Memached + BDB
-- supports replication, keyscans
- 28,000+ writes per second
- Persistent key/value storage
- Works with Memcached clients
- used in areas where de-normalization would require more writes than MySQL can handle

War Stories...
- Digg Images
-- 15,000 - 17,000 submissions per day
-- crawl for images, video, embeds, source, and other meta data
-- ran in parallel via Gearman <== Need to research this...

- Green Badges
-- 230,000 diggs per day
-- most active Diggers are also most followed
-- 3,000 writes per second
-- ran in background via Gearman
-- Eventually consistent