Blog: Richard Hackathorn

Friday, 2 November 2007
Where to Pick Your Battles

Over breakfast I had a delightful chat with Richard Buckle, an experienced, well-traveled student of the IT industry. He is quite a blogger, rather extensive in his comments and subtle (not!) in his humor. See his latest blogging creation.

An issue that surfaced amid our eggs and pancakes was the impact of Web 2.0 technology. I was relaying my perceptions from the IBM IOD Conference. In particular, I was surprised by IBM's emphasis on Web 2.0 as an essential part of future enterprise architectures. I even queried a panel of IBM executives on the sanity of executing such flaky technology on sacred mainframe systems.

The answer that I got involves the careful choosing of one's IT battles within the enterprise. Given the demands of today's global businesses and given the complexity of the information relevant to the business, traditional IT has no hope of satisfying all those requirements. Doing IT the same way will result in chaos far beyond the proliferation of user-created spreadsheet systems of the last decade. Using Web 2.0, leave the User Interface layer to the users, because each will want something different and will want it NOW. Choose, instead, battle lines around supplying quality enterprise information through a Service Oriented Architecture organized by key business processes.

Hmmmm... This is a new twist - a political one - to the whole SOA discussion. For more details, see Richard's blog.

Friday, 31 August 2007
The Black Swan of Business Intelligence

Sometimes you start reading a book with low expectations about its significance. But the book surprises you and delivers a message of great significance. That has happened with a new book entitled The Black Swan: The Impact of the Highly Improbable by Nassim Nicholas Taleb. He is a professor of the Sciences of Uncertainty (an odd title) at the University of Massachusetts. See his Wikipedia entry and a PBS podcast.
Let me start with the bottom line. I strongly recommend this book for all professionals in Business Intelligence (BI) who care about the means and results of our profession upon our clients.

I have this naïve belief that more information is better, assuming that the information is relevant to the business, properly cleansed, structured cross-functionally, analyzed appropriately, distributed to the right people and so on. This book totally negated that belief, instilling a humble attitude toward how much we cannot know and shocking me about how much damage our current BI practices do to our clients.

And... I have read only the first few chapters. I am starting to be aware of the problems in general, confused about their implications for BI, and wondering whether there are any solutions. This is a book that will take several months to consume (because you read a few sentences, think ‘what?’ and then reread them several more times).

Let me give a small taste of Taleb’s argument. Before Australia was discovered, everyone knew that all swans were white, because all swans that were ever observed were white. Therefore, the rule of nature was that all swans are white. Then someone discovered a black swan in Australia. That one swan negated a belief held for a thousand years by all of mankind. Afterward, people concocted explanations as to why such a rare animal was perfectly normal and should have been expected. Taleb then extends this analogy to explain the events and aftermath of September 11, along with many other pivotal events in human history.

That is the Black Swan. It is a totally unexpected, but significant, rare event that seems plausible... afterwards. In Taleb’s words, the Black Swan is an event with three attributes: “First, it is an outlier as it lies outside the realm of regular expectations, because nothing in the past can convincingly point to its possibility. Second, it carries an extreme impact [changing our basic paradigms that explain the world].
Third, in spite of its outlier status, human nature makes us concoct explanations for its occurrence after the fact, making it explainable and predictable.”

I submit that we are unprepared to handle the Black Swan with current BI technology and practices. In fact, current BI does more harm than good, by giving us a false sense of reliability in what we think we know.

Help me with my struggle to understand the practical importance of the Black Swan. I would like to get a discussion established on Black Swan issues within the BI profession, along with joint publications with some of you. Is there anyone interested in this pilgrimage?

Thursday, 26 July 2007
Data Warehouse Appliances: A Sign of the Times

I received an email from Tera-Tom, a good colleague who has established a solid reputation designing and implementing large-scale data warehouse (DW) systems based mainly on Teradata. His full name is Tom Coffing, CEO of Coffing Data Warehousing. His email stated that Tera-Tom is morphing into Netezza-Tom, DATAllegro-Tom, and even Greenplum-Tom. In other words, he is incorporating these DW appliance products into his system integration work.

This announcement caught my attention, having recently completed a research study with Colin White on "Data Warehouse Appliances: Evolution or Revolution" for the BI Network. A key issue is whether DW appliances are mature and robust enough to support enterprise DW systems. Tom seems to think so!

I chided Tom a bit by challenging him to declare that DW appliances are now respectable for enterprise DW by large corporations. I received the following reply:

“There are some big companies like Amazon.Com that truly are using appliance vendors as their EDW. It depends on what queries are running. Because appliances have recently come of age, many companies already had their EDW in place. So, many of the largest, most loyal, and staunch supporters of traditional EDW vendors have begun implementing a multi-vendor strategy.
Instead of allowing a single hardware vendor to hold them hostage, they have utilized appliances to perform certain EDW functions. It is a strategy that improves performance, saves costs, and gives them options and negotiating power. If you have seen the recent pricing on traditional EDW products, you might want to keep your options open. The bottom line is that traditional EDW vendors are no longer the only games in town. I thought I would never say that so I am almost as surprised as you.”

So, Tom is observing that mature DW companies are moving to a multi-platform strategy that may appropriately include DW appliances. And, the reasons are not all technical.

I think that the observations by Appliance-Tom are a sign of the times for enterprise data warehousing. They are a-changing...

Tuesday, 24 July 2007
BI and your Corporate Fantasy League

You have heard of the sports fantasy leagues for soccer, basketball and American football. You pick the best players for your own team and then, depending on the actual performance of the players, your team is awarded points. It is a popular game, with over 15 million people involved generating about $2 billion annually, according to the Fantasy Sports Trade Association.

An article in the Los Angeles Times caught my eye because fantasy games are being applied in other areas. There are now fantasy games about the US Congress, music recording, box office movies, fashion, celebrities, soap operas and even husbands. The last one was intriguing. You select from a pool of real-life husbands who are polled on what they would do in various relationship situations. If your fantasy husband calls the divorce lawyer, you lose points!

How does this relate to Business Intelligence in corporations? Imagine a fantasy game about your company. This ‘serious’ game is open to all employees who have dreams of running the company. Establish several performance criteria, like sales, cost-of-goods and new accounts for the month.
Teams choose their mixture of corporate strategies. Based on predictive analytics from the BI system, the corporate simulator cranks through the numbers and computes a performance dashboard for each team. (Here is where the miracle occurs!) Calculated performance is compared to actual, so that the fantasy teams compete against the actual management team!

Yuck it up! Give cash prizes for best performance. The Board of Directors might consider changes in top management slots based on the results. American Idol meets Corporate America.

It is the next-gen version of the suggestion box. Besides, it has the potential of getting employees to think more deeply about their company and realize that big decisions are tough ones.

Monday, 23 July 2007
Business by the Data – NOT!

As professionals in Business Intelligence and Data Warehousing, we live and die by the creed of Business by the Data. Deep inside, we know that, if we do our job properly, our companies will be managed better and will perform successfully because with better data come better decisions. Right?

It is funny! After many decades of struggle, we are entering an era of BI/DW where that creed will be proven correct or exposed as a cruel lie. The evolving practice and technology of BI/DW are awesome! I never cease to be amazed at what innovative companies are doing and what innovative vendors are offering.

Today there was a WSJ article (front left column of Marketplace) that caused me to think deeper on this issue. The article rambles but makes a useful point. “Managers can be so focused on perfecting today’s business that they lose sight of tomorrow’s” – that was a quote from a book by Sutton & Pfeffer entitled “Hard Facts, Dangerous Half-Truths & Total Nonsense”. The point is that the data used in management discussions to make decisions is often filled with misleading and even false assumptions.
In particular, we focus too much on the past and (with real-time BI) the present, without thinking deeply and clearly about how our business will change next month and especially this week. Those constant changes are so fundamental to the nature of business. Data from our BI systems often clouds our better judgments, which a hundred years ago would have been labeled common sense.

This echoes an old problem. Whenever we would print a report on the line printer, the data often seemed inconsistent or even mysterious, until we carefully analyzed the application programs that maintained the data. There, buried in some COBOL clause, was a hidden crucial assumption about how the data was to be used, created on a whim by a programmer meeting a deadline. Thus, the quest to capture meta-data emerged.

The same is now true of analytics. Buried in the calculations are crucial assumptions that permeate our dashboards. Guess what? Those assumptions are deeply hidden, may or may not be valid, and affect business performance in a big way.

Therefore, Business by the Data – yes, but be careful. It would be better to say Business by Insights, which can be stimulated by data from BI systems. Always ask why when viewing data. Clear thinking will never be replaced by sophisticated analytics.

Wednesday, 27 June 2007
Leveraging Human Intelligence into BI

Today's WSJ contained an article on Human Computation - that's right! Human Computation. This is the label that computer scientists give to systems that incorporate low-level human judgments on a massive scale.

Prof. Luis von Ahn of Carnegie Mellon created a game called the ESP game. The apparent objective is to guess what another person is thinking. It works by pairing two random persons and showing the same photograph to both. You are supposed to enter the phrase that you think the other person is thinking. If they match, then both are awarded points. You get 3 minutes to describe 15 photos.
Over 130,000 people have spent hours playing this game! The kicker is that the real objective of Prof. von Ahn is to categorize photographs with meaningful tags... all for free! As the WSJ article points out, he is pulling a 'Tom Sawyer' by enticing others to paint the fence of image recognition.

A more generalized effort is the Mechanical Turk by Amazon.com, which aims to be the eBay for services. It is named after a fake chess-playing robot popular from 1770 to 1880. Read the Wikipedia entry; it is fascinating! Amazon coined the term Human Intelligence Task or HIT. A requester defines the HIT, such as transcribing a podcast or finding a company on the web. Someone can accept the HIT and perform the task. When the work is accepted by the requester, the fee (usually less than a dollar) is deposited in the worker's Amazon account.

Another example is Rent-A-Coder, a marketplace with small but serious coding projects. The fees are typically around $100.

So, what? How does this relate to BI/DW in corporations? There are a lot of HITs needed to maintain any BI system. The objective is to construct a game that will leverage the intelligence of employees and even customers to categorize information, detect trends, and surface issues. Instead of limiting these tasks to the few having responsibility, open the task to a wider audience.

BI/DW in Second Life

In a recent article, I have argued that Virtual World technology can be a key enabler for the next generation of BI systems. I have also organized a group of local professionals to explore the potential of Virtual World technology at Serious Second Life Meetup of Boulder.

So, last night I was wandering NOAA Island testing the new 3D Voice. I got a message from a random avatar named Leinad Meriman. We started to chat, and I discovered that it was Dan Power of DSS Resources! He is quite active in Second Life and shares my feelings about the potential of this technology.

Shawn Rogers and I have created a group called Business Intelligence Network in Second Life.
If you are registered in SL, please join our group. So far we have...
- Richard Hackathorn as Hack Richard

Can I add your name (and avatar) to this list?

If you are interested in learning about SL, please contact me. I can help you through the initial hours of walking into walls as an avatar. :)

Thursday, 15 February 2007
Googling Business Intelligence

Several years ago, I wrote a book on Web Farming for the Data Warehouse. The focus was on the systematic refining of Web-based information resources for business intelligence, since critical information about a business was increasingly created externally to the enterprise.

Today Cognos hosted a web seminar with Google to explain how the Cognos 8 Suite leverages the Google OneBox. Other vendors currently using the Google OneBox include Cisco, Employease, Netsuite, Oracle, Salesforce.com, and SAS.

Two aspects of the Google-Cognos solution were noteworthy. First, the Google OneBox synthesized unstructured data both inside and outside the enterprise in the vocabulary of the business. The little tricks that Google does for its generic users at Google.com get specialized for the business.

Second, the Google OneBox leverages the unstructured metadata about the BI implementation for a business. For instance, anything from the report definitions to actual field values can be served up in a Google query result. So, a Cognos report can be a result item, which can be invoked directly.

My gut said... This is a future glimpse at the integration of all enterprise information across traditional technology barriers.

Saturday, 10 February 2007
Rewiring the Enterprise with Yahoo! Pipes

Every few years there is a technology innovation to which you just have this gut reaction of 'wow'. Yahoo! released its Pipes on Thursday. And, WOW!

As with many innovations, the elements of the innovation are not new or unique in and of themselves. However, the synthesis is much more than the sum of the parts. What are the elements of Yahoo! Pipes?
1) data flow diagrams with a nice web-based UI - nothing new!
=> Yahoo! Pipes - awesome new!

Pipes is a bit hard to get into. Documentation is thin. So, be patient. Learn through examples. See the example by Nik Cubrilovic of TechCrunch to understand the simplicity and power of Pipes. Subscribe to its RSS feed for your NewsGator. Then, go mash it up!

I created a simple Pipe for taking articles from both the US and UK sites of the Business Intelligence Network and filtering them on a user-specified topic. You can mash it up here!

So what? ...to business intelligence in the enterprise. This is a glimpse of the future for wiring the flow of enterprise data among employees, partners, suppliers, etc. A popular Pipes mash-up via web service is the next gen of that past-season ad hoc spreadsheet via sneaker-net.

Tuesday, 6 February 2007
Sailing with Windward

I had coffee with Dave Thielen, founder and president of WindwardReports, this morning. They have an easy-to-use reporting tool that extends Microsoft Word with XML tags, retrieving data from SQL or XML data sources. The company is a nice example of bootstrap entrepreneurship. They have sold their product to a nice assortment of large glamour companies, like Pfizer, EDS, Fidelity, JP Morgan, and the like.

Dave recently scored big with a viral marketing campaign that delivered international attention. Dave contracted with two young artists - Luke Barats and Joe Bereta - to do a promotional video for YouTube. They are so funny! You've just got to see Cubicle Wars and then read Dave's blog for his reflections.

As high-tech folks, we have a lot to learn about packaging our complex ideas into simple and funny stories like this!

Monday, 4 December 2006
One Terabyte Disk for Under $500

Well, I have concluded that it has finally happened! One TB for under $500. We all have been watching the ads for disk storage over the past year, seeing the prices go down and capacities increase. It was just a matter of time...
Okay, this disk is not managed storage with UPS and RAID redundancy in a standard rack, but it is a whole terabyte. Do you realize how many libraries this datastore could contain?

Here are the details: LaCie Big Disk 1TB, USB 2.0 Hard Drive, Model 300966U, 5.25" 1U External Hot-swappable, 7200RPM with 8MB cache, Data Transfer Rate of 480Mbps Maximum, 34MBps Sustained. And... it is 1.7" Height x 6.7" Length x 10.6" Depth and weighs just 5 pounds. At BUY.com the LaCie Big Disk is $458.52 with no tax and free shipping.

For BI/DW professionals, what does this mean? It certainly gives a whole new meaning to independent (personal) data marts. We can have some amazingly rich stores of business information. But, for what purpose? Can we maintain data consistency across the enterprise when we have terabytes walking the hallways and out the doors? Can we maintain data security across the enterprise? This raises familiar issues that will only get more intense.

Monday, 13 November 2006
BI and Second Life: Creative Collaboration

The WSJ today carried an article on Second Life. The focus was how creative media/marketing companies were using this 3D environment for creative collaboration. The London-based ad agency Bartle Bogle Hegarty had a video produced on YouTube by the Electric Sheep Company (a system integrator specializing in SecondLife).

Technorati Tags: Second Life, Business Intelligence, creative collaboration
Friday, 3 November 2006
BI and SecondLife

Have you heard of SecondLife.com? Unless you are into multiplayer video games, you probably have not. SecondLife is not a video game in the usual sense. It is a virtual reality environment where over a million people interact by roaming around in their custom avatars, conversing with other avatars, building unusual things (that can be scripted with equally unusual behaviors) and buying/selling virtual items (like clothing, houses and land). There are usually ten thousand people doing these activities at any time. And, close to a million REAL dollars are being exchanged every day! This is not your normal video game. Take a quick read of the feature article in BusinessWeek.

Why is SecondLife important to our BI/DW professional community? This technology for interactive virtual reality is maturing into several areas having potential benefits for corporate IT environments. Let me suggest a couple...

First, SecondLife can be used as an educational and collaborative environment. See the efforts of the New Media Consortium that are pooling the resources and intellect of hundreds of universities. The secret sauce of SecondLife is that groups can share and interact with complex objects (prims). I predict that enterprise architectures could be designed and coordinated within SecondLife as if they were large buildings of smart Lego blocks. Think of a system management environment where you walk in full 3D around system components, not in a physical sense, but in a logical one. Status is indicated by movement, colors, sounds, and clouds.

Second, SecondLife is an ideal environment for data visualization. This would not be your normal table or chart, but a data forest where the information density is equivalent to that of a natural forest.

In a few weeks, I will blog about the launch of NOAA Island for science outreach (which is headed by my son!). Are there any IT professionals who are also on SecondLife?
Please email me or comment on this blog. I will give you a personal (but virtual) tour of NOAA Island.

Wednesday, 11 October 2006
What to do about Very Large Data Streams (VLDS)

In his article on Enterprise Data Mountain, Part I, Colin White opens a Pandora’s Box of various data store components for the new architecture of enterprise data integration, whose objective is to create an integrated and consistent view of your enterprise. The architecture is service-oriented, using an Enterprise Service Bus to link the components. Forget the old paradigm of OLTP-OLAP happily coexisting! It isn’t your father’s data warehouse! It’s a new ballgame, folks!

But... Colin forgot an important component... More and more data comes to the enterprise as a stream. You say, "So what? Just batch the stream!" This is a BIG stream, as in megabytes per second coming from a virtual data store of many petabytes. This is a CONSTANTLY CHANGING stream in content and structure. Forget storing the data. Examine it on the fly and decide instantly what aspects are important to manage within the enterprise data mountain. This gives new meaning to the phrase "unstructured data". See the work of Jeff Jonas of IBM on perpetual analytics.

Let’s label this new component as a Very Large Data Stream (VLDS), as a bow to the critical contributions of the Very Large Data Base community over the past twenty years.

Is VLDS important? I would predict that such streams are and will be critical sources to the enterprise data mountain as it evolves over the coming decade. New media sources (established news channels, blogging, YouTube), near real-time satellite images, detailed weather data, point-of-sale data from all your stores plus samples of the rest of your industry, RFID data tracking your full supply chain globally, and on and on... That is Enterprise 2.0 of the future.

What do you think? Crazy ideas or right on?

Wednesday, 13 September 2006
Fifty Years Ago Today

It was fifty years ago today! You ask... What was?
At the IBM San Jose Lab, the RAMAC computer was unveiled. It was the first machine with magnetic disk storage. Before this introduction, data was stored on sequential magnetic tape or, worse, paper tape. It weighed over one ton, had 50 spinning platters, held 5MB of data, and cost $50,000.

Trivia quiz: What does RAMAC stand for? Don't cheat... Do you know? If you know, comment on this blog. Win a free drink at the next conference, courtesy of yours truly.

So, why the big fuss for Business Intelligence? Without the evolution of the random-access hard disk, BI would not have existed. And, I would probably be selling paper tape readers today! The whole concept of a single consistent view of business reality as embodied in the enterprise data warehouse would be a distant dream. Large enterprises would be totally ineffective in a global economy. And, globalisation would be confined to regions, at best.

What is amazing about the past fifty years is the dramatic evolution of disk technology. As quoted by Dan Fost of the San Francisco Chronicle, Dave Wickersham, COO of Seagate, compared hard disks to automobiles. "A car in 1956 cost about $2,500, could hold five people, weighed a ton, and could go as fast as 100 mph. If the auto industry had kept pace with disk drives, a car today would cost less than $25, hold 160,000 people, weigh half a pound, and travel up to 940 mph." Amazing...

Thursday, 3 August 2006
A Whole New (Global) World for RFID

An article in eWeek caught my eye on Bartending-RFID Style. I am not sure whether the Bartending part or the RFID part was the attention getter. Or, was it the combination?

In global supply chains, RFID is a technology that will play a critical role. RFID will enable the global management of millions of products from raw materials to manufacturing to distribution to consumption to retirement - all of which change hands among thousands of companies and millions of consumers. This is mind-boggling!
I have thought of RFID as a static or passive identification of a specific item. Ping it, and you get its unique identification. However, this article paints a different picture of 'active-tag' RFID. This technology not only monitors the presence of each bottle, but it also monitors each time the bottle is poured, the tilt of the bottle, the duration of the pour, and over time figures out the bartender's pouring style to calculate the amount that is leaving the bottle. WOW! Now that's tracking...

And, think of the applications in the global supply chain! Not only do you know whether an item is present, but you know its condition. For expensive or perishable merchandise, active-tag RFID could be critical in reducing spoilage.

Gentag is one of the companies involved with this technology. An application that they have identified is the 'smart skin patch' that will monitor glucose, cardiac, UV, etc. Through a nearby cell phone, the readings can be managed centrally.
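
For the curious, the bottle-pouring idea described earlier in this post can be sketched in a few lines of Python. This is only a rough illustration and not how the eWeek article's vendor or Gentag actually does it. The PourEvent fields, the 60-degree tilt threshold, and the flat 30 ml-per-second flow rate are all assumptions invented here; a real system would presumably calibrate the flow rate to each bartender's pouring style.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PourEvent:
    """One reading from a hypothetical active tag on a bottle."""
    tilt_degrees: float      # angle from upright (0 = standing on the bar)
    duration_seconds: float  # how long the bottle was held at that tilt


def estimate_poured_ml(event: PourEvent,
                       flow_ml_per_second: float = 30.0,
                       min_tilt_degrees: float = 60.0) -> float:
    """Estimate the volume of one pour.

    Assumption: nothing leaves the bottle below the tilt threshold, and
    above it the liquid flows at a constant, pre-calibrated rate.
    """
    if event.tilt_degrees < min_tilt_degrees:
        return 0.0
    return flow_ml_per_second * event.duration_seconds


def remaining_ml(bottle_ml: float, events: Iterable[PourEvent]) -> float:
    """Subtract every estimated pour from the starting volume."""
    poured = sum(estimate_poured_ml(e) for e in events)
    return max(bottle_ml - poured, 0.0)


if __name__ == "__main__":
    night = [
        PourEvent(tilt_degrees=120, duration_seconds=1.5),  # a real pour
        PourEvent(tilt_degrees=20, duration_seconds=2.0),   # bottle just moved
        PourEvent(tilt_degrees=135, duration_seconds=1.0),  # another pour
    ]
    print(remaining_ml(750.0, night))
```

The same pattern (event in, estimate out, running total kept centrally) would apply to condition monitoring in a supply chain, with temperature or shock readings in place of tilt.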