July 19, 2008

Why SaaS is Good for IT vs. Why SaaS Isn't Good for IT

I've heard this argument back and forth from a number of people, so it is worth commenting on, particularly as it relates to business intelligence systems, that is, data warehouses and the like.

Here are the positive themes: cost savings, freeing up resources for other strategic projects, improved quality and distribution of BI information, and improved coverage of business units.

Here are the negatives: risk of reduced budget, risk of career disruption/change.

The net is that IT folks have to view SaaS as "good" for them because (a) it is good for the business and (b) it is inevitable.

Let me elaborate.

The basic argument for SaaS in general is just the Adam Smith economic division of labor argument, i.e., you should make use of services instead of being self-sufficient for the same reason you buy food at the supermarket instead of farming your own - it is more efficient for a specialized producer to do it than for individuals to each do something they do not have the specialized skills for. That efficiency is returned to us in lowered cost, and increased reliability of supply. For IT systems this argument is presented really well in The Big Switch, which I highly recommend.

Contrariwise, this implies that if you have computing needs for which services aren't available, then by all means you should do those in house. This goes all the way down to writing your own software if software isn't available to do what you need. This is where the "good" part of the SaaS trend for IT comes from. It's the ability to refocus efforts on things that are strategic, not commodity.

For SaaS BI, more specifically, well it is good for the business because it saves cost over doing it in-house, and in most cases we at Oco can do it far better than an in-house project would be able to do. The cost reduction can actually enable BI to be available for the first time at many businesses which could not afford to do it before on the price scales that larger enterprises have been paying.

However, for businesses that replace an in-house effort with SaaS BI, whether it is "good" for IT depends on what is done with the savings. If these are re-invested in other strategic IT projects then it creates ongoing career opportunities for IT folks. However, if those savings are just used to increase profits and/or pay dividends to shareholders, then this good for the business is at best neutral for IT folks working in the trenches, and possibly very negative. Change can be opportunity, but always presents challenges.

If a new IT technology is good for the business, then strategic IT departments have to find a way to embrace it and make it successful for the business. It's untenable for an IT department to push back on important technology for "turf" reasons alone.

The career plans of people that are vested today in in-house creation of BI/DW systems clearly need to evolve to match the new reality. E.g., if you are an IT leader who wants a successful DW/BI deployment on your resume as part of your career plans, well you have to consider whether it would look better to have a SaaS deployment of BI on your resume instead. A SaaS BI deployment should save your organization lots of money, so perhaps you would also be able to feature the other projects you can now afford to do with the budget savings.

If you are a data analyst, then by definition your company already sees the importance of having data analysts, and having a SaaS BI deployment should just let you move up the value chain to deliver the BI information that goes beyond what you can get from your Oco system, or which explores the directions for enhancements of your SaaS-deployed BI solution. 

If you are an IT worker in the trenches, such as a DBA or systems administrator hoping to work on a DW/BI deployment, then there certainly is a risk of a disruption of your plans due to SaaS. I'm not sure I can sweeten this for you. As IT switches over to services and away from in-house deployments, your job and role are going to change. You really do have to read The Big Switch, and should consider looking for employment on the services side, that is, with the service providers. If you stick with your existing business/employer, you should look for something you can work on that is more strategic, i.e., that is related to a system that is very specific to your business, and not common across that business and many others. I can't find the origin of this quote, but the point bluntly is to "get strategic or get outsourced".

There's another way in which an inevitable change can be "good" for IT. The inevitability is basically saying that since it's coming, it's "good" to be a thought leader and embrace the change and get some advantage from it before everyone else has those advantages as well, at which point you are just playing catch-up. This argument is also known as "the best defense is a good offense".

The alternative is to fight the trend, rather than switch, but this is just burying one's head in the sand, to mix in another metaphor.

July 14, 2008

SaaS BI Going Mainstream: What Business Objects OnDemand + Oco Partnership Means to Me

Today Oco announced a partnership with Business Objects. I'm not going to reiterate what the announcement says since you can read that elsewhere. To me the key point here is that a major trusted BI supplier, Business Objects, is (and has been for a while now) heavily investing in OnDemand/SaaS/hosted deployment, and they've recognized that Oco's solution provides value for their customers through the integrated information content Oco creates accessed by way of the Business Objects OnDemand hosted environment.

To Oco this partnership is great because it legitimizes what we do and really reduces the shock-factor that our prospective customers see. Let's face it. Oco is an innovator (read "small" -- though growing!) company which by itself sells a solution which is delivered "shockingly" quickly (6 to 10 weeks), is financially "risk free", is delivered via a "radical" new deployment model (SaaS), and provides only a UI targeted at the business user. This is all pretty unfamiliar stuff for a conservative BI customer. Keep in mind that the track record on Data Warehouse/BI projects is historically pretty poor - hence many people have become conservative about them.

Business Objects changes the balance considerably: You replace the small company risk-factor with a major trusted supplier like Business Objects. You replace this radical SaaS deployment model with the fact that Business Objects OnDemand represents this SaaS trend going mainstream. Finally, you get what I like to call "design headroom" that the broad suite of Business Objects tools offers which caters to the complete user community that traditionally consumes BI spanning from the business users (who depend on Crystal Reports and Xcelsius dashboards) to data analysts performing ad-hoc analysis (for example via Web Intelligence).

What was shocking, new, and risky, is now a pretty safe bet, and customers can focus on what the new value is that the solution provides, not on all the seemingly risky new-ness of the way it is created and delivered.

My thanks go out to my colleagues at Business Objects who examined Oco, saw the value here, and worked to add us into their partnership sphere. Everyone at Oco, myself included, will be working hard to make sure we deliver great solutions to our joint customers.

July 11, 2008

DFDL = Data Format Description Language - The syntax of data

I chair a standards committee/workgroup within an organization called the Open Grid Forum. The workgroup is on something called Data Format Description Language, or DFDL, which you can pronounce "Daffodil" if you want.

DFDL is about facilitating data interchange, critical to most computing, but to data integration for BI applications in particular.

Many people ask why DFDL is needed in an era where there are so many standard data formats available (e.g., why not just use XML?). There are a number of social phenomena in the way software is developed which have led to the current situation where DFDL is needed to standardize description of diverse data formats.

First, programs are very often written speculatively, that is, without any advance understanding of how important they will become. Appropriately given this situation, little effort is expended on data formats since it remains easier to program the I/O in the most straightforward way possible given the programming tools in use. Even something as simple as using an XML-based data format is harder than simply using the native I/O libraries of a programming language.

At some point however, it is realized that the program is important because either lots of people are using it, or it has become important for business or organizational needs to start using it in larger scale deployments. At that point it is often too late to go back and change the data formats. For example, there may be real or perceived business costs to delaying a deployment of a program for a rewrite just to change the data formats, particularly if such rewriting will reduce performance of the program and increase costs of deployment. (It takes longer to program, but at least it's slower when you are done ;-)

Additionally, the need for data format standardization for interchange with other software may not even be clear at the point where a program first becomes 'important'. Eventually, however, the need for data interchange with the program becomes apparent. At that point, you look back at the data format and maybe it's not too complex yet. So you don't re-engineer it. But add a year or so of evolution of the software with the attendant changes here and there to the data formats and suddenly you have a real problem.

The above phenomena are not something that is going away any time soon. There are of course efforts to much more smoothly integrate standardized data format handling into programming languages. But it is very unclear whether these will catch on, and there is, regardless, a role for DFDL since it allows after-the-fact description of a data format.

DFDL is also needed for performance reasons. At the hairy edge of computing people are always trying to process ever more data to gain some competitive advantage. At this edge, the performance penalty from using verbose data formats like XML can become very burdensome.
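To give a rough feel for that verbosity penalty, here is a minimal Python sketch. The record and its field names are invented for illustration; it just encodes the same three numbers once as fixed-width binary (using the standard struct module) and once as XML, then compares the sizes.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record: the field names and widths are made up for this comparison.
record = {"id": 95, "code": 35, "amount": 222038109}


def to_binary(rec: dict) -> bytes:
    # Big-endian, no padding: 2-byte, 1-byte and 4-byte unsigned integers (7 bytes).
    return struct.pack(">HBI", rec["id"], rec["code"], rec["amount"])


def to_xml(rec: dict) -> bytes:
    # One element per field, values written as text.
    root = ET.Element("record")
    for name, value in rec.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root)


binary = to_binary(record)
xml = to_xml(record)
print(f"binary: {len(binary)} bytes, XML: {len(xml)} bytes")
```

The XML version is several times larger for the same information, and the gap is multiplied by every record in the file. Of course the binary version is exactly the kind of format that nobody can read later without a description of it.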

Lastly, there's this problem with data format debugging. I will elaborate a bit here because this is a hidden tax on every integration project where there is data in files. I get emails from field engineers trying to help customers with data format problems fairly often. Here's a typical one:

Problem: EBCDIC file, no Cobol file definition. Customer is telling me that it contains 5 fields.  Also tells me that the first 3 fields are unsigned, fourth field is Packed, 5th field is 'EBCDIC' (which makes no sense). Based on this a row looks like this:
            
            149,13568,0,4,"UUNR LEA "
            
Customer says the row should look like this:
            
            95,35,48,222038109,"UUNR LEA IN"

Here are a couple of records dumped out in hexadecimal:

0095350000000000000048222038109fe4e4d5d940d3c5c140c9d540000f
0095350000000951500501088322722fd2e4d9e3e94b4bf0d9f64b4b002f

Figuring out problems like this is what I call data archeology. Most people can't imagine that this sort of work is still needed routinely in data integration projects. This is truly geeky stuff, I mean hex dumps?!

A tool, a data format debugger, is badly needed to address this.

Looking at the above hex, some brain heuristics coupled with the information the customer provided allow me to chop up the hex like so:

0095 35 0000000000000048 222038109f e4e4d5d940d3c5c140c9d540 000f
0095 35 0000000951500501 088322722f d2e4d9e3e94b4bf0d9f64b4b 002f

Analysis: Looks like the "unsigned" fields are also packed. I.e., what they meant by "unsigned" was "unsigned packed", not unsigned base 2 binary. Seems like they also didn't tell me about all the fields. There's one more field at the end.
            
            Unsigned packed 4 digits
            Unsigned packed 2 digits
            Unsigned packed 16 digits
            Unsigned packed 9 digits (the final "F" nibble is padding)
            Fixed length 12 character string, EBCDIC character set
            Unsigned packed 3 digits (the final "F" nibble is padding)
            
It's also possible that the fields with the "F" nibbles are signed packed numbers but from a non-IBM mainframe Cobol compiler. (IBM standard Cobol uses C and D as signs, and F as padding for odd-length unsigned packed decimal I believe.)
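For the curious, here is a minimal Python sketch of that decoding. It is only an illustration of the layout I guessed above, not a general tool: the field names are hypothetical, I am assuming the common cp037 EBCDIC code page (the customer never said which one), and I treat the third field as 16 digits because it occupies 8 bytes with no padding nibble.

```python
# Layout guessed in the analysis above: (name, width in bytes, kind).
# The field names are hypothetical; the customer never supplied any.
LAYOUT = [
    ("f1", 2, "packed"),     # unsigned packed, 4 digits
    ("f2", 1, "packed"),     # unsigned packed, 2 digits
    ("f3", 8, "packed"),     # unsigned packed, 16 digits
    ("f4", 5, "packed_f"),   # 9 digits plus a trailing F nibble
    ("f5", 12, "ebcdic"),    # fixed-length text, assumed code page cp037
    ("f6", 2, "packed_f"),   # 3 digits plus a trailing F nibble
]


def decode_packed(raw: bytes, trailing_f: bool) -> int:
    """Decode unsigned packed decimal: one decimal digit per nibble."""
    nibbles = raw.hex()
    if trailing_f:
        if nibbles[-1] != "f":
            raise ValueError(f"expected trailing F nibble in {nibbles}")
        nibbles = nibbles[:-1]
    if not nibbles.isdigit():
        raise ValueError(f"non-decimal nibble in {nibbles}")
    return int(nibbles)


def decode_record(hex_dump: str) -> list:
    """Split one hex-dumped record according to LAYOUT and decode each field."""
    raw = bytes.fromhex(hex_dump)  # spaces between bytes are ignored
    expected = sum(width for _name, width, _kind in LAYOUT)
    if len(raw) != expected:
        raise ValueError(f"record is {len(raw)} bytes, expected {expected}")
    values, pos = [], 0
    for _name, width, kind in LAYOUT:
        chunk = raw[pos:pos + width]
        pos += width
        if kind == "ebcdic":
            values.append(chunk.decode("cp037"))
        else:
            values.append(decode_packed(chunk, trailing_f=(kind == "packed_f")))
    return values


RECORDS = [
    "0095 35 0000000000000048 222038109f e4e4d5d940d3c5c140c9d540 000f",
    "0095 35 0000000951500501 088322722f d2e4d9e3e94b4bf0d9f64b4b 002f",
]

for dump in RECORDS:
    print(decode_record(dump))
```

With this layout the first five fields of the first record come out as 95, 35, 48, 222038109 and "UUNR LEA IN " (with a trailing pad space), which is the row the customer said they expected, plus the extra field at the end that nobody mentioned.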

I hope you are now as sick of this example as I am... phew!

Now, pulling up out of those gritty details: the depth of experience and knowledge needed to pull off this kind of data archeology is pretty expensive to hire, and it is a huge tax on our industry every time this sort of thing comes up, which is far more often than people imagine.

Unfortunately, data format issues are not a sexy, high-value adding feature for a software product. Yes, these issues are costly for customers. But no customer says "Gee I'm happy to pay another bunch of dollars to have a data format debugger." It's not in their top 10 new feature requests because it is hard for them to imagine the huge cost that is repeatedly sunk into mundane data archeology in a single major data warehousing or BI initiative.

Most people who are naive to this issue figure that each of the major vendors has a proprietary data format language in its product portfolio. Reality is that the larger vendors have half a dozen or more. The difficulty is that the lack of a standard for data format description means that all these separate software bases are being maintained to do roughly the same thing, and no one product team can afford to invest in a really good data format debugger since it works only with that one product's proprietary data format system. As a result, nobody gets a data format debugger that is really any good, and we all pay the tax again and again.

A standard for DFDL fixes this. A data format debugger pays dividends across multiple products if they are using a common DFDL, and so the investment to make one is worthwhile even though it's not a sexy new feature that customers are asking for.

I did overstate above when I said no customers ask for this. There are some educated consumers who ask for DFDL. I started work on DFDL when a customer made it clear that they would buy more software from me if the data format description language it used was standard, not proprietary. This is the best reason to promote a standard. Not in order to commoditize some software but to enable increased usage. A standard DFDL would allow them to use it to interoperate across software from multiple vendors. They could invest safely in tools that leveraged this common data format description language. At the time we were dealing with interoperation with SAS, which has its own data format descriptions. We were struggling with moving data from my software (this was Ascential at the time) to and from SAS (as in The SAS Institute) applications. When you have a record format with 700+ fields in it, and there's an error somewhere in the middle, debugging it is pretty hard. Eventually we found the problem, but the customer learned that the effort involved in using these pieces of software together and dealing with their proprietary data format languages was not worth it. If a standard data format language was shared then they could use these software packages for more things, and use them together easily.

You can learn more about the DFDL standard here.

June 27, 2008

Solutions vs. Tools - Round 1, and BI for the Equipment Servicing Industry

A theme I will undoubtedly hit over and over is this difference between solutions and tools in Business Intelligence. To some, the BI market is entirely populated by tools vendors. If you say you sell a BI solution people will quite literally ask back "what kind of a tool is it?".

Of course the difference between a tool and a solution depends on your perspective. If you are a data analyst, a solution to your particular problem may be a superior tool to help you analyze data in flexible ways. However, if you are a business person looking for solutions to business problems, then a data analysis tool is clearly not a solution. It might be part of one, but much additional business knowledge must be used to turn it into a business solution. So when I use the term 'solution' I generally mean 'business solution' to a business problem for a business person to use, and specifically NOT a tool for a data analyst to use.

At Oco, this is a huge part of what we do. We add business value by facilitating cross-business-unit agreement on common definitions for business concepts, so that our reporting solution really addresses the needs of the business and is usable by business people. Our solution contains quite specialized knowledge about the business problem areas we address.

As an example of this, imagine you work for an equipment manufacturer. The equipment needs servicing, repair, preventative maintenance, and so forth. An important part of your business is revenue from this service business. Optimizing this business for profit, customer satisfaction and such is, naturally, important. So, you don't need a "BI tool". Rather, what you need is a system for understanding and optimizing the way you run this services business. To this point, Oco (with Aberdeen and Qualcomm) is having a webinar titled: Smart Business Intelligence Solutions For Optimizing Your Service Operations. Attend if you are in the equipment/service industry, or really want to understand the difference between a solution and a tool for BI. Once you start to appreciate the value of this embedded business value in a BI solution, you start to use the term "BI tool" rather pejoratively, as in "XYZ product is interesting, but it's just a tool".

June 26, 2008

Your Computation Mileage May Vary

Computation may be the first energy using technology where there is no natural limit on how much people will use. E.g., consider cars - I might want a giant SUV, but I don't want one 10 times bigger than that. There's a practical limit here. But in computation? Why wouldn't I want a massive supercomputer at my beck and call 24/7? If it makes my life a little more convenient then why not? It's not like I have to haul it around with me or park it, and it can be literally anywhere in the world and I don't care. Now, in the era of big internet search engines, a huge number of computers are at my beck and call any time I want. I type a query in the box and perhaps thousands of computers jump in a data center somewhere. Also, many computers have been scouring the internet preparing the data they are jumping to send back to me when I query them.

These computers and the energy they consume weren't there a few short years ago. Now they're turned on and busy processing the queries we all throw at them. So, my personal energy consumption due to this computation has gone up quite a bit. There's a carbon footprint here. Normally we don't think about the energy consumption of these computers. What is the carbon footprint of my internet searching anyway?

Well, let's say the thousands of computers that jump when I query the internet are racked up. A standard computer rack can hold from 20 to 80 computers depending on model, and can consume 20 kilowatts of power. Cooling eats up another 50% of that power, so 30 kilowatts total per rack. When I say thousands of computers jump for me, that means on the order of hundreds of racks. So let's say 100 racks of computers at 30 kilowatts per rack.

That's 3+ Megawatts.

Just how much energy is that? Is that a lot? It sounds like a lot.

Well, most people drive cars and understand energy in terms of gallons of gasoline, so let's convert 3 megawatts into gallons of gasoline burned efficiently, per unit of time. I drive an efficient car (not the mega SUV I mentioned above), so I use up about 10 gallons of gas a week. I use my favorite internet search engine perhaps 10 times a day. Let's say my queries take 1 second each for the thousands of computers that jump whenever I search. So that's 10 x 7 days x 1 second = 70 seconds. So 70 seconds of 3 megawatts is 210 megawatt seconds. A gallon of gas contains roughly 35 kilowatt hours of energy, which is 126,000 kilowatt seconds, or 126 megawatt seconds. So my Google searches burn up about 1.67 gallons of gas a week. This is about 1/6 of my regular fuel consumption per week.

Furthermore, it costs, at the current $4/gallon, around $6.67.
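If you want to play with the assumptions, the whole back-of-the-envelope estimate fits in a few lines of Python. This is just a sketch of the arithmetic above; every default value is one of my rough guesses from the text, not a measurement.

```python
def weekly_search_gallons(
    racks: int = 100,
    kw_per_rack: float = 30.0,        # 20 kW of computers plus 50% more for cooling
    searches_per_day: int = 10,
    seconds_per_search: float = 1.0,
    kwh_per_gallon: float = 35.0,     # rough energy content of a gallon of gasoline
) -> float:
    """Gallons of gasoline equivalent to one week of searches."""
    power_mw = racks * kw_per_rack / 1000.0                      # 3 MW
    busy_seconds = searches_per_day * 7 * seconds_per_search     # 70 s
    energy_mw_seconds = power_mw * busy_seconds                  # 210 MW-seconds
    mw_seconds_per_gallon = kwh_per_gallon * 3600.0 / 1000.0     # 126 MW-seconds
    return energy_mw_seconds / mw_seconds_per_gallon


DOLLARS_PER_GALLON = 4.0

gallons = weekly_search_gallons()
print(f"{gallons:.2f} gallons, ${gallons * DOLLARS_PER_GALLON:.2f} per week")
```

Changing seconds_per_search or the number of racks shows how sensitive the answer is to the guesses: the result scales linearly with each of them.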

Now, the above estimates of compute time are probably overly generous. I just did a Google search, and they say it took 0.32 seconds to execute, quite a bit less than my estimate of 1 second above. Also, I don't know if that's the total computation time, or the parallel computation time on some huge cluster of computers. If it's just one computer spending that 0.32 seconds, then the energy is of course much less.

But that's beside the point. I want really high quality information available at my fingertips. If thousands of computers did have to jump to get it for me at my every whim, then I don't really care.... there's no natural limit to what I want here. It's not like driving a giant SUV where a much bigger one wouldn't be interesting. Why would I care how big the computer that serves me has to be? It's not like it has to fit in my house.

That is, until the price tag for the energy comes into play. And the carbon footprint of that energy if it comes from fossil fuel.

June 25, 2008

Amazon, Microsoft, and Google Reinvent the Database - Did it Need Reinventing?

Amazon has introduced a simple database, and Google's App Engine includes a simplified database. Microsoft is beta testing SQL Server Data Services, which despite the name "SQL Server" is anything but. It's got a new LINQ-based query system.

Haven't these guys ever heard of a little something called SQL, which is a standard? It feels more like Google wants to lock you in with their Google-specific data API, and ditto for Amazon's DB and Microsoft's new baby.

This reminds me of the browser wars all over again. Different players, same slug-fest. Nobody is taking the interests of the developer who wants portability to heart. I want to deploy an application and I want to be able to use either Google, Microsoft, or Amazon because my application is written to some standards. These parties are behaving as if that is exactly what they don't want you to be able to do.

At least on the Amazon cloud I can run whatever Linux system instance I want, so I can run a database of my choosing. Google provides no such flexibility (yet).

All this database innovation would be fine if it really delivered value, but I really fail to see the advantages of these new systems over say, a SQL subset, which could be simplified substantially from the broader SQL standard. Not for a real system anyway.

June 24, 2008

Open Source, GPL and SaaS

So, there are lots of reasons why SaaS rocks as a way of delivering functionality to customers. One of them is the ability to make good use of GPL'ed software.

At a regular closed-source software company, you can't include any software in your products that you get from the web which carries the GPL (GNU General Public License), because this license requires all "derived works" to also use the GPL and be open source. This is why the GPL is often called the "Gnu public virus": the license contaminates the software it touches.

What's worse, there's a more limited variant, the LGPL (GNU Lesser General Public License), which is supposed to allow linking to other software without having the GPL provisions apply to the whole linked "work" of software. Though this would seem to fix the problem, many software companies don't allow use of LGPL software either. (The reasons seem to be more paranoia about patent restrictions than anything real, but lawyers are lawyers.)

Here's the trick: The GPL is troublesome only for companies that distribute software. Here's where the SaaS idea wins. We don't distribute our software. We're a service provider. We have no problem at all basing our service on software built from GPL'ed pieces. We're in perfect compliance with the license.

The GPL recently was updated from v2 to v3, and there was worry that v3 would close what is called the "service loophole". This didn't happen. GPL v3 still allows the service loophole. They decided against closing it.

Net result: there's a huge and growing body of open source software, and the largest part of it carries the GPL license. As a SaaS provider we can exploit this base of code to do what we want without fear.

This is a big advantage over "ordinary" software.

June 23, 2008

SaaS vs. On-Premise Software for the Enterprise - Round 1

The SaaS delivery model is quite compelling to me. I founded a dot-com company back in 1999 which was what we called back then an “application service provider”, but the acronym ASP now seems to mean “average sales price”, and SaaS is what we call this deployment strategy today. Anyway,  we delivered an information service to Internet portals which they used to enhance their search results. It was here that I discovered the sheer joy of being able to select and standardize on a processing platform independent of the customer’s choices. I say ‘joy’ because on-premise software in the enterprise space is really difficult to create. I have to digress here a bit to tell you why.

Today, there is not much difference between computers from the various manufacturers. The reason one company is an HP shop and another a Sun shop probably has little to do with the technology from those vendors and much to do with which vendor has a superior sales rep in the territory covering that company. This is not cynicism. It's just that computing platforms are largely commodity these days. There’s no value being added here by these platform selections, but it implies that an on-premise software offering from any vendor has to run on both Sun and HP platforms, and many others.

In enterprise software, this diversity of platforms creates a tremendous overhead for on-premise software companies. Consider a standard enterprise integration software package. It must run on Unix variants from IBM, Sun, HP, also Linux, Windows. It might also have to run on mainframes, mid-tiers (AS400 and such). It must support many versions of the OS for each, it must provide reliable connectivity to Oracle, DB2, Sybase, and many others, and many versions of each, not to mention every ERP platform (+versions thereof)…. I think you see that the combinatorics of this make QA quite troubling.
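To make the combinatorics concrete, here is a small Python sketch. The platform and database names come from the paragraph above, but the number of versions per platform and per database is purely an illustrative assumption, and I've left out mainframes, mid-tiers and ERP systems entirely.

```python
from itertools import product
from math import prod

# Hypothetical on-premise support matrix. The version counts are invented
# for illustration; real products often support more.
ON_PREMISE = {
    "platform": ["AIX", "Solaris", "HP-UX", "Linux", "Windows"],
    "os_version": ["older", "current", "newest"],
    "database": ["Oracle", "DB2", "Sybase"],
    "db_version": ["older", "current", "newest"],
}

# Hypothetical SaaS matrix: the provider picks one platform and one database.
SAAS = {
    "platform": ["Linux"],
    "os_version": ["current", "newest"],
    "database": ["Oracle"],
    "db_version": ["current"],
}


def count_configurations(matrix: dict) -> int:
    """Number of distinct combinations QA would have to cover."""
    return prod(len(choices) for choices in matrix.values())


def configurations(matrix: dict):
    """Yield every combination as a dict, e.g. to drive a test run."""
    keys = list(matrix)
    for combo in product(*matrix.values()):
        yield dict(zip(keys, combo))


print("on-premise:", count_configurations(ON_PREMISE))  # 5 * 3 * 3 * 3 = 135
print("SaaS:", count_configurations(SAAS))              # 1 * 2 * 1 * 1 = 2
```

Every additional dimension multiplies the total, which is why adding "just one more" supported database or ERP version is never as cheap as it sounds.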

It gets worse.

There’s this aspect where every important feature of enterprise software has to be available but also pluggable. E.g., when selling a software package to a customer they will ask: “Does it have high-availability (HA) features?” The answer here is supposed to be “yes”. But another customer will ask: “Does it work with the HA package I already bought and am using for these other systems I have?”, and the answer here is also supposed to be “yes”.  So HA has to be both built in, and pluggable. This is a nightmare for a software architect for enterprise software. Making a reliable software system where something as intimately configured as HA is pluggable is…. well, very challenging.  Scary really.

Now consider SaaS instead of software packages. The provider (me, that is, Oco) selects an approach to scalability, HA, etc., and selects a standard platform. We get big economy of scale from this. We have only to support a small number of versions of anything, and that’s just so we retain some internal flexibility in what we’re doing to provide reliable and performant services for our customers. This frees us up to concentrate on adding value by adding the features our customers need. QA is vastly simplified because the combinatoric explosion is gone. Involving our customers in the data aspects of the QA is also easy in that new and old versions of an Oco service can be simultaneously up and usable. Customers can evaluate/test a new one while continuing to use an older one. There are no “forklift” upgrades, i.e., where the new software version supersedes the old one and there is no going back, and no agony over whether installing a new version will disrupt a previously functioning system.

To a software developer, working on the features is where it's at. It's more satisfying to deliver more features to customers sooner. It's liberating and energizing. All these lead to good psychology for keeping the developers motivated, and that adds up to satisfied customers.

Now if you consider the arguments above, appliances (where the product consists of hardware + software) can also be a reasonable approach, so long as maintenance of the system isn't a burden placed on the customer. But that is a subject for a future posting.

June 18, 2008

Business Integration - Dare I say "Semantic" Integration?

There's a number of fairly deep insights about what Oco does that explain a number of issues in our industry. There's all these stats about how 1/2 or more of BI/DW (data warehouse) projects fail. There's a reason for this: there's a big gap between the IT systems and staff who do BI/DW projects, and the business people who must use the data to make business decisions. The gap runs two ways. The IT folks don't know enough about the business, and the business people don't know enough about the IT systems.

A big part of the value Oco provides is our structured methodology for figuring out the business metrics needed, and our technology for finding them in the data. This allows us to close this gap.

What Oco does is what I call semantic integration.

Now "semantics" is my 2nd least favorite word. My least favorite is "ontology", which is only used to obfuscate from what I can tell. I only use the word "semantics" to distinguish from "syntax", i.e., by semantics I mean "more meaning than just syntax". I hope this becomes clear below.

What most projects (the kind that often fail) do is syntactic integration. IT folks are told business people need more data. So they set out to do what they can do, which is to collect together all the data of the organization in one data warehouse; however, they lack the business knowledge to consolidate disparate information into common concepts. They can access and move the data all into one database, because that's a syntactic problem.

I live in Red Sox Nation (yes I live near Boston, yes we have been sports spoiled for a few years now. Guilty as charged!), so let me illustrate this syntax vs. semantics issue with an example from baseball.

Suppose I have a data file. In it is a bunch of records about baseball player statistics. One of the fields is named "AVG". It is an integer. This is very likely to be the batting average, and we all know that batting average is a number between 0 and 1, typically rounded to 3 decimal places. Batting a thousand is 1.000. Typically this kind of data would be stored as an integer, not a decimal.

Syntactically, I can access the AVG field, and put it in a database column of my data warehouse and I can even do validity checking to make sure no value is negative or above 1000. This is all low-level syntactic stuff.

Now, moving up to the business or "semantic" layer. A real baseball fan or manager, who understands the business, knows that nobody bats above 0.500, and here's an interesting thing: a batting average isn't even computed for a player until they've had at least 12 at-bats. This allows one to deal with missing data in the data set, or at least some of the missing data. So I hope you see how I needed to understand the data in a more powerful way. I have to get the business meaning of the data in order to understand how to deal with simple issues like whether it's ok for this data field to be missing.
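Here is a minimal Python sketch of the difference between the two layers. The function names are hypothetical, and the 12 at-bat threshold and the 0.500 ceiling are simply the rules of thumb from the paragraph above, taken as assumptions.

```python
from typing import Optional

MIN_AT_BATS = 12      # assumed threshold below which no average is computed
PLAUSIBLE_MAX = 500   # "nobody bats above 0.500", in thousandths


def syntactic_check(avg_raw: int) -> bool:
    """Syntax layer: the stored integer must be between 0 and 1000."""
    return 0 <= avg_raw <= 1000


def looks_suspicious(avg_raw: int) -> bool:
    """Semantic layer: syntactically valid, but a fan would not believe it."""
    return avg_raw > PLAUSIBLE_MAX


def semantic_average(avg_raw: Optional[int], at_bats: int) -> Optional[float]:
    """Semantic layer: decide whether a missing AVG is acceptable or an error."""
    if at_bats < MIN_AT_BATS:
        return None  # legitimately missing: too few at-bats to compute one
    if avg_raw is None:
        raise ValueError("AVG is missing for a player with enough at-bats")
    if not syntactic_check(avg_raw):
        raise ValueError(f"AVG out of range: {avg_raw}")
    return avg_raw / 1000


print(semantic_average(None, at_bats=5))    # missing, and that is fine
print(semantic_average(300, at_bats=100))   # an ordinary .300 hitter
print(looks_suspicious(750))                # passes the syntax check, still wrong
```

The range check needs nothing but the field definition. The other two functions cannot be written without knowing something about baseball.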

Ok, if you are following my argument so far, it goes deeper. A more savvy baseball manager knows that batting average isn't such a great statistic, and that on-base percentage is actually far more predictive of whether players help win games. This is real business knowledge. So, if I am trying to evaluate players, I need to compute, from whatever data I have, the on-base percentage. Knowing this, I can take data from disparate systems which contain a variety of low-level raw stats, and compute the on-base percentage from them. I also need agreement across my team, or league or whatever, that this statistic is one we should all measure in the same way, and that when we say "on-base percentage" we all mean the same thing by it.
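Here is a small Python sketch of that idea. The two source systems and their field names are hypothetical, invented for illustration. The formula is the standard one: hits plus walks plus hit-by-pitch, divided by at-bats plus walks plus hit-by-pitch plus sacrifice flies.

```python
# Hypothetical field names from two source systems, mapped to common concepts.
SYSTEM_A_FIELDS = {"H": "hits", "BB": "walks", "HBP": "hit_by_pitch",
                   "AB": "at_bats", "SF": "sac_flies"}
SYSTEM_B_FIELDS = {"hits": "hits", "base_on_balls": "walks",
                   "hit_by_pitch": "hit_by_pitch", "at_bats": "at_bats",
                   "sacrifice_flies": "sac_flies"}


def to_common(record, field_map):
    """Rename a source record's fields to the agreed common names."""
    return {common: record[source] for source, common in field_map.items()}


def on_base_percentage(stats):
    """Compute OBP from a record that uses the common field names."""
    reached = stats["hits"] + stats["walks"] + stats["hit_by_pitch"]
    chances = (stats["at_bats"] + stats["walks"]
               + stats["hit_by_pitch"] + stats["sac_flies"])
    if chances == 0:
        return None  # no plate appearances that count, so OBP is undefined
    return reached / chances


record_a = {"H": 150, "BB": 60, "HBP": 5, "AB": 500, "SF": 5}
record_b = {"hits": 150, "base_on_balls": 60, "hit_by_pitch": 5,
            "at_bats": 500, "sacrifice_flies": 5}
obp_a = on_base_percentage(to_common(record_a, SYSTEM_A_FIELDS))
obp_b = on_base_percentage(to_common(record_b, SYSTEM_B_FIELDS))
```

The renaming step is the syntactic part. The agreement that everyone computes the statistic with this one function is the semantic part.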

This is what I mean by semantic integration. Without this agreement that on-base percentage is the metric of merit, you are stuck at the syntax layer wondering why you don't have compatible data coming in from each of the source systems into the data warehouse. That is, the IT person who hasn't been told to compute on-base percentage doesn't compute it. The data analyst trying to use the DW doesn't find useful semantic metrics like on-base percentage in the data. Rather, they find the raw ingredients from which it could be computed, but not consistently or easily. The IT folks can't do the reconciliations needed to allow on-base percentage to be computed easily, because they're operating below that level of knowledge.

Now flip over to the role of the business department trying to use a data warehouse. The data analyst can't analyze the data because they have to do the semantic integration first. All the data is syntactically integrated, but it's not exactly set up to allow computing the statistic of interest. A very complex integration must happen in the queries on the data warehouse to get to something meaningful to the business. This is why data warehouses spawn data marts, by the way: the data mart makes up for some of the mismatch. It is used to fill in some of this missing business semantics.

This ability to understand the semantic integration that is needed, to facilitate a discussion among the business people who need the data, to extract from them what these metrics are, and then to use the metrics to drive the lower-level syntactic integration - this is truly what makes Oco work. This is a key part of what makes our solutions valuable.

My BIASes

Welcome readers.

This blog will contain my opinions, call them my biases if you want, on topics related to Business Intelligence, and Software as a Service.   

My acronym "BIAS" for "Business Intelligence as a Service" is very unofficially what Oco (www.oco-inc.com) does. I say "unofficially" because I did not check with the VP Sales and Marketing before starting with this (Hi Anil), but some other folks said "go for it", so here we are!

Lots of blogs I see are collections of snippets each roughly with content of "Check THIS out" containing links. You will definitely not get these sorts of items here. I will comment and link things when I have something useful to add.

Many people are curious about what makes SaaS such a good idea, and I have pretty strong biases/opinions about this subject. I am CTO (or CBO) at Oco because I'm a big believer in the importance of this evolution of computer technology. I will share the reasons why I decided to get into this specialty, and why I believe it will be successful and important in the marketplace. I think these reasons are relevant to people on the buy-side and the sell-side of this technology who need to understand its advantages. I believe many of these advantages are actually quite subtle and not yet well represented in cyberspace.

As for bias and non-bias: I considered posting this blog at some neutral sites where there are other bloggers on related topics. I decided not to go there because the blogs there have this fairness about them with respect to vendors, and the writers of them don't work for vendors, and so on. That is so not me! I am far too biased. I work at Oco because I'm passionate about what it does and the value it delivers, and I believe it is important. Yes, they pay me. I think everyone understands that.

That said, this blog is not a marketing vehicle. I take my reputation as a technologist very seriously, and I would never tell someone that what Oco has to sell is good for them to purchase unless I firmly believed this was truly going to make their life easier, solve a problem, or generally add value for them. So while my biases will be expressed here, you can be assured that my biases are for good technical reasons, or at least intuition that I hope is correct, and not just because it is what my company happens to sell.

So, that's enough of the caveats and "bias" puns.... 

Many people are looking at the SaaS trend today and want to understand how and when to engage with it, what it really means, why it's a good idea, etc. I've been in high-volume commercial data processing since 1995, and enterprise software during that entire period. I've seen the issues of enterprise software development (and its Achilles heel, QA, which will be a post "real soon now"), and I strongly believe this hosted/cloud "as a Service" deployment paradigm is very clearly the way for companies like mine to deliver value to customers most effectively, and specifically the way for business intelligence applications to be created and deployed for businesses today.

Thanks for reading, and I hope you find subsequent posts useful. Please comment on them. Don't hesitate. Show your biases please. I'm convinced most people have good reasons for them.
