From the Principal of OnStrategies

Tony Baer

Subscribe to Tony Baer: eMailAlertsEmail Alerts
Get Tony Baer: homepageHomepage mobileMobile rssRSS facebookFacebook twitterTwitter linkedinLinkedIn


Top Stories by Tony Baer

It’s no secret that rocket .. err … data scientists are in short supply. The explosion of data and the corresponding explosion of tools, and the knock-on impacts of Moore’s and Metcalfe’s laws, is that there is more data, more connections, and more technology to process it than ever. At last year’s Hadoop World, there was a feeding frenzy for data scientists, which only barely dwarfed demand for the more technically oriented data architects. In English, that means: Potential MacArthur Grant recipients who have a passion and insight for data, the mathematical and statistical prowess for ginning up the algorithms, and the artistry for painting the picture that all that data leads to. That’s what we mean by data scientists. People who understand the platform side of Big Data, a.k.a., data architect or data engineer. The data architect side will be the more straightforw... (more)

Another Vote for the Apache Hadoop Stack

As we’ve noted previously, the measure of success of an open source stack is the degree to which the target remains intact. That either comes as part of a captive open source project, where a vendor unilaterally open sources their code (typically hosting the project) to promote adoption, or a community model where a neutral industry body hosts the project and gains support from a diverse cross section of vendors and advanced developers. In that case, the goal is getting the formal standard to also become the de facto standard. The most successful open source projects are those t... (more)

Informatica's Stretch Goal

Informatica is within a year or two of becoming a $1 billion company, and the CEO’s stretch goal is to get to $3b. Informatica has been on a decent tear. It’s had a string of roughly 30 consecutive growth quarters, growth over the last 6 years averaging 20%, and 2011 revenues nearing $800 million. Abbasi took charge back in 2004, lifting Informatica out of its midlife crisis by ditching an abortive foray into analytic applications, instead expanding from the company’s data transformation roots to data integration. Getting the company to its current level came largely through a seri... (more)

Fast Data Hits the Big Data Fast Lane

Of the 3 "V’s” of Big Data – volume, variety, velocity (we’d add "Value” as the 4th V) – velocity has been the unsung ‘V.’ With the spotlight on Hadoop, the popular image of Big Data is large petabyte data stores of unstructured data (which are the first two V’s). While Big Data has been thought of as large stores of data at rest, it can also be about data in motion. "Fast Data” refers to processes that require lower latencies than would otherwise be possible with optimized disk-based storage. Fast Data is not a single technology, but a spectrum of approaches that process data t... (more)

Big Moves in Big Data: EMC's Hadoop Strategy

To date, Big Storage has been locked out of Big Data. It’s been all about direct attached storage for several reasons. First, Advanced SQL players have typically optimized architectures from data structure (using columnar), unique compression algorithms, and liberal usage of caching to juice response over hundreds of terabytes. For the NoSQL side, it’s been about cheap, cheap, cheap along the Internet data center model: have lots of commodity stuff and scale it out. Hadoop was engineered exactly for such an architecture; rather than speed, it was optimized for sheer linear scale.... (more)