Here are my raw, unedited notes from Ian Robinson's QCon San Francisco 2008 talk, scribed by the wonderful Ken Kolchier, since Ian was doing his presentation on my laptop...
fictional example as representative context
archived feeds to improve cacheability, and why/when?
how to implement a solution
"Restbucks"/"How to get a cup of coffee" article on InfoQ, reframed version of Gregor's article, framed into more of a REST modality.
whole bunch of back-office functionality involved: Order Management (supervising the fulfillment of an order), Inventory (maintaining inventory within a store), Product Management, Regional Distribution (distributing to the various locations), etc.
"Terrorist Cell Services" - self contained to do all the work it needs to, as opposed to a 3-layered architecture blown into a Service deployment.
for Services to do their job, who needs to know what?
Order Management needs to have knowledge of Product to do its job...
lots of stuff, even a distributed app, might constitute the internals of one of these Services, yet this discussion is about the Service boundary interactions.
options for implementation: point-to-point, bus, and consumer-pull-events
most of today's solutions tend towards the "bus" type of solution as middleware.
with a bus, the publisher doesn't need to know anything about the particular consumers. the bus reduces locational and temporal coupling.
with consumer-pull, the guaranteed delivery aspect is delegated to the consumers themselves, so there is no subscriber list needed on the publisher side.
today the main implementation decision is between the bus and consumer-pull options.
RESTbucks is rather British, so "no public displays of affection", so consumer-pull is in order.
Polling an Atom feed, so Product Management having an Atom feed of updates, that the Order Management (client) polls, and sticks into its internal 3rd-party app.
This makes sense, because "eventual consistency" is what's needed, as there are no low-latency requirements, though there is a guaranteed delivery need.
Atom Syndication format
answers the question "how might we represent interesting web resources", whereas the Atom Publishing Protocol answers "how do we go about creating and updating those resources?"
xml vocabulary for representing document-like resources.. title, pub date, author, etc.
a feed is a directory/list of resources. an entry is a resource, with a timestamp, but not necessarily in time order.
feed, entry, and content.
you can use the entry as a simple envelope, with the content itself holding the actual interesting elements of an entry.
the metadata of the entry can be used as domain resource representation if the feed is exposing events as the resources
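A minimal sketch of the "entry as envelope" idea, using only the Python standard library. The ids, timestamps, author name, and the `http://restbucks.example/schema` payload namespace are made up for illustration and were not part of the talk.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ATOM = "{" + ATOM_NS + "}"
ET.register_namespace("", ATOM_NS)  # serialize Atom elements without a prefix


def build_entry(entry_id, title, updated, author, payload_xml):
    """Wrap a domain payload in an Atom entry used as a simple envelope."""
    entry = ET.Element(ATOM + "entry")
    ET.SubElement(entry, ATOM + "id").text = entry_id
    ET.SubElement(entry, ATOM + "title").text = title
    ET.SubElement(entry, ATOM + "updated").text = updated
    author_el = ET.SubElement(entry, ATOM + "author")
    ET.SubElement(author_el, ATOM + "name").text = author
    # The domain representation travels inside atom:content.
    content = ET.SubElement(entry, ATOM + "content", {"type": "application/xml"})
    content.append(ET.fromstring(payload_xml))
    return entry


# Hypothetical product payload; the schema namespace is an assumption.
PAYLOAD = (
    '<product xmlns="http://restbucks.example/schema">'
    "<name>Latte</name><price>3.00</price></product>"
)

entry = build_entry(
    "tag:restbucks.example,2008:product-event-1",
    "Product price changed",
    "2008-11-20T10:05:00Z",
    "Product Management",
    PAYLOAD,
)
print(ET.tostring(entry, encoding="unicode"))
```

The Atom metadata (id, title, updated) describes the event, while the consumer's domain logic only needs to look inside the content element.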
Atom Pub Protocol
an application protocol, built with HTTP (another application protocol), uses Location Headers, ETags, etc.
core resources described by the protocol: collection, member (entry or link).
collections are time-ordered; the protocol introduces the "edited" element, which is what the time-ordering is arranged by.
is there still an "updated" element?
Collections are Atom feeds, Members are Atom entries
Publishing and consuming - not necessarily the same feeds
"/product-catalog", "/promotions/notifications", "/products/notifications"
consumers and updaters would likely have different interests, and so slightly different URIs.
category elements enable clients to filter for "themes of interest"
link element contains an href that's a stand-alone resource representation of the entry. entries are first-class citizens, they don't need to exist in a feed.
the self link is a canonical uri. entries likely represent just the current state of affairs (maybe the last hour), with older events being archived. the "prev-archive" link is a uri to the last archived set.
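To make the link and category notes concrete, here is a rough sketch of how a client might pull the "self" and "prev-archive" links out of a feed and filter entries by category. The feed document, hostnames, and URI layout below are invented for the example.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed for the current hour; all URIs are assumptions.
FEED_XML = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Product notifications</title>
  <id>tag:restbucks.example,2008:products-notifications</id>
  <updated>2008-11-20T10:15:00Z</updated>
  <link rel="self" href="http://restbucks.example/products/notifications/2008/11/20/10"/>
  <link rel="prev-archive" href="http://restbucks.example/products/notifications/2008/11/20/09"/>
  <entry>
    <id>tag:restbucks.example,2008:event-42</id>
    <title>Product price changed</title>
    <updated>2008-11-20T10:05:00Z</updated>
    <category term="product"/>
    <link rel="self" href="http://restbucks.example/products/notifications/events/42"/>
  </entry>
  <entry>
    <id>tag:restbucks.example,2008:event-43</id>
    <title>Promotion started</title>
    <updated>2008-11-20T10:15:00Z</updated>
    <category term="promotion"/>
    <link rel="self" href="http://restbucks.example/products/notifications/events/43"/>
  </entry>
</feed>"""


def links_by_rel(element):
    """Map link relation -> href for the direct atom:link children."""
    return {
        link.get("rel", "alternate"): link.get("href")
        for link in element.findall(ATOM + "link")
    }


def entries_in_category(feed, term):
    """Return the entries carrying an atom:category with the given term."""
    return [
        entry
        for entry in feed.findall(ATOM + "entry")
        if any(c.get("term") == term for c in entry.findall(ATOM + "category"))
    ]


feed = ET.fromstring(FEED_XML)
feed_links = links_by_rel(feed)
product_entries = entries_in_category(feed, "product")
print(feed_links["prev-archive"])
print([links_by_rel(e)["self"] for e in product_entries])
```

Each entry's own "self" link is what lets it stand alone as a resource, outside of any feed.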
on-the-wire view
cache-control header makes sense here, since there is no low-latency requirement, and gives a hint to clients about polling interval
content-location header is the canonical uri
etags are a hash so that we can do a conditional GET later on
events that have occurred are immutable. for archived (last hour) feeds, the cacheability is a lot longer. for the current hour's feed, the individual entries are cacheable long-term, but the feed will likely change.
question: so if something in the last hour's feed is edited, that archive doesn't change? no. the entries representing events wouldn't change. if we edit the underlying domain model, that would generate a new event.
WinINet cache, the same one that Internet Explorer uses, for the local cache?
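A small sketch of how the publisher might choose response headers for the current feed versus an archived one. The function name, the max-age values, and the URIs are assumptions picked for illustration; the talk did not give concrete numbers.

```python
def feed_headers(canonical_uri, archived,
                 current_max_age=60, archive_max_age=30 * 24 * 3600):
    """Pick caching headers for a feed response.

    Archived feeds hold immutable events, so they can be cached for a long
    time; the current feed still changes, so it gets a short max-age that
    also hints at a sensible polling interval.
    """
    max_age = archive_max_age if archived else current_max_age
    return {
        "Content-Type": "application/atom+xml",
        "Cache-Control": f"max-age={max_age}",
        "Content-Location": canonical_uri,  # canonical uri of this feed
    }


# Hypothetical URIs for the current hour and the previous (archived) hour.
current = feed_headers(
    "http://restbucks.example/products/notifications/2008/11/20/10", archived=False
)
archive = feed_headers(
    "http://restbucks.example/products/notifications/2008/11/20/09", archived=True
)
print(current["Cache-Control"])
print(archive["Cache-Control"])
```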
trolling through "previous" navigation links, until we find the last entry that we've applied internally, then can work our way forward
as we follow the "next" link, since we already hit those links navigating backwards, the "next" pulls will likely be found in the local cache
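The back-then-forward navigation could be sketched roughly as below. The feeds are plain in-memory dictionaries rather than real HTTP responses, the URIs and entry ids are invented, and the pages kept in the `pages` list stand in for the local HTTP cache that would serve the "next" requests.

```python
# Hypothetical feeds, newest entry first within each page.
FEEDS = {
    "/notifications/current": {
        "prev": "/notifications/archive/2", "entries": ["e6", "e5"]},
    "/notifications/archive/2": {
        "prev": "/notifications/archive/1", "entries": ["e4", "e3"]},
    "/notifications/archive/1": {
        "prev": None, "entries": ["e2", "e1"]},
}

fetched = []  # records which URIs were actually requested


def fetch(uri):
    """Stand-in for an HTTP GET of a feed page."""
    fetched.append(uri)
    return FEEDS[uri]


def new_entries_since(fetch_page, start_uri, last_applied):
    """Walk back until last_applied is found, then return newer entries oldest-first."""
    pages = []
    uri = start_uri
    while uri is not None:
        page = fetch_page(uri)
        pages.append(page)
        if last_applied in page["entries"]:
            break  # found the last entry we applied; stop going backwards
        uri = page["prev"]

    # Work forward again: oldest page first, oldest entry first.
    pending = []
    for page in reversed(pages):
        pending.extend(reversed(page["entries"]))
    if last_applied in pending:
        pending = pending[pending.index(last_applied) + 1:]
    return pending


to_apply = new_entries_since(fetch, "/notifications/current", "e3")
print(to_apply)
print(fetched)
```

If the client has never applied anything (`last_applied=None`), the walk simply continues to the oldest archive and every entry is returned.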
handling eager re-polling
conditional get "If-None-Match"
links to resource representations, and versioning information in the etags.
all this can be handled by a nice generic atom client.
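A minimal simulation of the conditional GET exchange, with no real network involved. `handle_get` and `PollingClient` are hypothetical names; the ETag is assumed to be a SHA-1 hash of the feed body, and the comparison is a plain string match (a real server would also handle lists of tags and weak validators).

```python
import hashlib


def make_etag(body):
    """Derive a quoted ETag from a hash of the representation."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def handle_get(body, request_headers):
    """Server side: answer 304 if the client's ETag still matches."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


class PollingClient:
    """Client side: remember the last ETag and send it on every poll."""

    def __init__(self):
        self.etag = None
        self.body = None

    def poll(self, server_body):
        headers = {"If-None-Match": self.etag} if self.etag else {}
        status, response_headers, body = handle_get(server_body, headers)
        if status == 200:
            self.etag = response_headers["ETag"]
            self.body = body
        return status


client = PollingClient()
first = client.poll(b"<feed>v1</feed>")   # nothing cached yet
second = client.poll(b"<feed>v1</feed>")  # eager re-poll, feed unchanged
third = client.poll(b"<feed>v2</feed>")   # feed has changed
print(first, second, third)
```

An eager re-poll costs only a small 304 response, since the feed body is not sent again.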
we want to be able to reuse a media handler
atom+xml tells us some things about what we can expect to be able to do with the links that are in it
application/xml doesn't give us any such information, so we generally have to resort to schema defs, etc.
we could use RDF or XHTML
we could also use that "out of band processing information" and put together a custom media type, i.e. "application/restbucks+xml"
if custom media types start to proliferate, they become less useful, as reusable handlers start becoming less reusable.
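One way to picture reusable media handlers is a registry keyed by media type, sketched below. The registry, decorator, and handler names are all hypothetical; the handlers only report what they would do rather than parsing anything.

```python
HANDLERS = {}


def handles(media_type):
    """Register a function as the handler for one media type."""
    def register(func):
        HANDLERS[media_type] = func
        return func
    return register


@handles("application/atom+xml")
def handle_atom(body):
    # A generic Atom handler knows about feeds, entries, and link relations.
    return "atom handler: can follow links in %d bytes" % len(body)


@handles("application/xml")
def handle_plain_xml(body):
    # Plain XML says nothing about links; meaning must come from a schema.
    return "xml handler: needs out-of-band schema knowledge"


def dispatch(content_type, body):
    """Select a handler from the Content-Type header, ignoring parameters."""
    media_type = content_type.split(";")[0].strip().lower()
    handler = HANDLERS.get(media_type)
    if handler is None:
        raise LookupError("no handler for " + media_type)
    return handler(body)


print(dispatch("application/atom+xml; charset=utf-8", b"<feed/>"))
```

Every custom type such as "application/restbucks+xml" would need its own entry in a registry like this, which is the reuse cost the note above is pointing at.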
when to use feeds?
when most of the clients are interested in most of the entries
if not, feeds might not be appropriate.
for per-client interest, entries can still be used (the stand-alone representation resource), just not likely put into a feed.
caching proxy helps reduce the overall latency, helps mask intermittent network failures of the Service itself (resiliency, scaling, etc)
caching dilemma: HighTTL/LowTTL
cache channels -- using atom to deal with freshness, channel attribute added to the Cache-Control header
question: what if no events have occurred yet in the current hour? would you get an empty feed? yes, with a link to the prev archive
question: can you reduce the size of the feed text by using relative URLs rather than absolute ones? yes, the spec does allow for that
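As a rough illustration of relative references, a client can resolve them against a base URI (in Atom, typically supplied via an xml:base attribute or the URI the feed was retrieved from). The base URI and paths below are made up.

```python
from urllib.parse import urljoin

# Hypothetical base URI, e.g. taken from xml:base on the feed element.
BASE = "http://restbucks.example/products/notifications/"

entry_uri = urljoin(BASE, "events/42")          # relative to the base path
catalog_uri = urljoin(BASE, "/product-catalog")  # relative to the host root

print(entry_uri)
print(catalog_uri)
```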