Saturday, August 21, 2010

Industry Data Warehouses

So what is an industry data warehouse? And why do we need them? Are they a threat to our confidential information and core competitive advantage? In very simple terms, industry data warehouses are created jointly by the firms in an industry group. They enable a better competitive environment by providing a common “score card” for participating firms. Now that you know the rules, you can find out who is winning, who is losing, and the game statistics. The competitive game gets even more riveting.

Given the right "fit", corporate data warehouses have the potential to provide a competitive advantage (please note the emphasis on "fit") to the owner, while industry data warehouses create cooperative advantage (what about coopetition instead of competition?) for the users in the industry. Therefore, industry data warehouses must provide common industry definitions of performance and market measurements, enable comparison of relevant parameters between a firm and the industry, ensure privacy of the data contributed by member firms, and have a sufficient number of firms from the industry to guarantee reliability as well as confidentiality of the information.

Industry data warehouses will have users who are data contributors as well as users who are not. The data warehouse must fulfill the needs of both groups by supporting firm-only queries, comparisons between the firm and the rest of the industry, benchmarking, and aggregate industry-only queries.

Industry data warehouses for benchmarking have become commonplace. More than a decade ago, they were just emerging. If you are interested in learning more about industry data warehouses and how to build them, below is a link to an article that I wrote in 1999. Enjoy in your copious free time!!

Link to the Article on Industry Data Warehouses

Note: This is a scanned copy of my article in PDF format, available from Google Docs.

Thursday, August 19, 2010

An SQL Journey: From Non-Procedural to Procedural

The simplicity of non-procedural languages is really enticing. Non-procedural languages describe the result instead of defining the process. For example, when I visit a restaurant I ask for what I want instead of describing how my meal should be cooked. Most of the time, we ask for what we need instead of describing how we want a particular result to be delivered. We leave the "how" to the experts. So we let the cook decide how to cook the food, allow Google to figure out how to do our Internet search and leave it to our car mechanic to figure out how to fix the car. This allows development of deep expertise in each area. This is the reason we have financial experts who don't know much about cars, doctors who don't know much about computers and computer engineers who don't know much about the pedagogy of the oppressed. To a certain extent, object-oriented languages have been able to mimic this real-world paradigm by defining operations and hiding the "know-how" in encapsulated modules. But all object-oriented languages are still procedural languages, since the programmers use flow control to define the events and actions.

A non-procedural language defines what is expected; therefore, non-procedural languages don't have flow control.

Let me give you an example:

SELECT FirstName, LastName
FROM STUDENTS_TABLE
WHERE TEACHER = 'Peter Drucker'

Here I have defined the result that I'm expecting. Give me the first name and last name of students whose teacher is Peter Drucker. I can get more granular and add course too. For instance,

SELECT FirstName, LastName
FROM STUDENTS_TABLE
WHERE TEACHER = 'Peter Drucker'
AND COURSE = 'Management 301'

This is a purely non-procedural statement, since nowhere in the statement have I defined how the result should be obtained. Both these statements are valid SQL statements.
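To make the contrast concrete, here is a minimal sketch in Python that runs the second query through the built-in sqlite3 module and then produces the same answer with hand-written flow control. Only the table and column names come from the queries above; the column types, the sample rows and the function names are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical sample rows; only the table and column names come from the post.
ROWS = [
    ("Ada", "Lovelace", "Peter Drucker", "Management 301"),
    ("Alan", "Turing", "Peter Drucker", "Management 101"),
    ("Grace", "Hopper", "Tom Peters", "Management 301"),
]


def build_db():
    """Create an in-memory database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE STUDENTS_TABLE "
        "(FirstName TEXT, LastName TEXT, TEACHER TEXT, COURSE TEXT)"
    )
    conn.executemany("INSERT INTO STUDENTS_TABLE VALUES (?, ?, ?, ?)", ROWS)
    return conn


def non_procedural(conn):
    """Describe the result; the database engine decides how to find it."""
    sql = """
        SELECT FirstName, LastName
        FROM STUDENTS_TABLE
        WHERE TEACHER = 'Peter Drucker'
          AND COURSE = 'Management 301'
        ORDER BY LastName
    """
    return conn.execute(sql).fetchall()


def procedural(rows):
    """Spell out every step: loop over the rows, test each one, collect matches."""
    result = []
    for first, last, teacher, course in rows:
        if teacher == "Peter Drucker":
            if course == "Management 301":
                result.append((first, last))
    return sorted(result, key=lambda r: r[1])


conn = build_db()
declarative_result = non_procedural(conn)
procedural_result = procedural(ROWS)
print(declarative_result)  # [('Ada', 'Lovelace')]
```

Both functions return the same rows. The difference is that the second one contains a loop and two tests that I had to write myself, while the first one contains none.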

When SQL started out, it was based on the solid mathematical underpinnings of relational algebra and it was truly non-procedural. However, there has been an unfortunate evolution of SQL. Slowly but surely, more and more procedural constructs have been added to SQL. For example, the addition of the CASE expression in SQL-92 was definitely procedural. Vendors have added their own flow control. Sybase added "stored procedures". The rest of the vendors, including Oracle, quickly followed the lead by adding their own procedures (of course, Microsoft had licensed the Sybase source code). I think Teradata tried to maintain the purity and elegance of non-procedural SQL for a long time. Teradata had its own vested interest. It was very hard to build a procedural language on a "massively parallel processing" (MPP) platform. Therefore, Teradata stayed true to the non-procedural operations of relational algebra for quite some time.

Procedural additions to non-procedural SQL have caused havoc in programming. Initially, it was impossible to debug procedural SQL due to the lack of breakpoints. Companies wasted millions of hours of programming time trying to debug procedural SQL. Of course, I must say that debugging the original non-procedural SQL with multiple sub-queries and correlated sub-queries was equally painful. But technology is always seductive. Technology is the ultimate silver bullet to kill the corporate vampires. Therefore, procedural SQL or stored procedures were widely adopted. Under the newly emerging client/server computing architecture, you could write all your business rules in procedural SQL, store them on the server with the database and minimize the network latency when "client" computers made a request for data. This was the end of non-procedural SQL.

To a certain extent, non-procedural SQL was doomed due to one of its major weaknesses: purity of logic. Its elegance was its failure. It was just too perfect. Good logic is neither common sense nor intuitive. I can give you some examples, which you can test in your copious free time:

Example: There are a bunch of car manufacturers, who make multiple brands of cars. Each manufacturer can make one or more brands. Car customers purchase one or more cars each year. Pretty simple, eh! Now, can you write a single (not multi-pass) SQL statement to find all the customers who have purchased all the cars made by a certain manufacturer?

Since I'm having trouble going to sleep and I can't think of anything better, I've set up a simple Microsoft Access database for you.

Here is the download link:

Click here to download Car Database as a Zip file from Google docs

or

Click here to download Car Database as a Zip file from Box.net

If you don't have Microsoft Access and want to use another database, I've added an Excel Spreadsheet with multiple worksheets corresponding to database tables. The database has five tables as follows: CAR, CUSTOMER, MANUFACTURER, CUSTOMER_CAR, MANUFACTURER_CAR.

This is a database in third normal form, which means that data redundancy has been reduced to a bare minimum. The CAR table has CarName and CarID, the CUSTOMER table has CustomerID, cFirstName and cLastName, and the MANUFACTURER table has ManufacturerID and ManufacturerName. Such tables are sometimes called lookup tables. The CUSTOMER_CAR table establishes a relationship between customers and cars. It shows which customer purchased which brand of car in which year. The CUSTOMER_CAR table has CustomerID, CarID and BuyYear. Similarly, the MANUFACTURER_CAR table establishes a relationship between manufacturers and car brands. It shows which manufacturer produces which brand of car. Therefore, the MANUFACTURER_CAR table has ManufacturerID, CarID and ReleaseYear.
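If you would rather not open Access or Excel, the five tables can be sketched in a few lines of Python with the built-in sqlite3 module. The table and column names are the ones listed above. The data types, the key declarations and every sample row (including the placeholder names such as "Brand A" and "Maker X") are my assumptions for illustration and are not taken from the downloadable database. The query at the end is only a plain join to show how the tables connect; it is not a solution to the brain teasers.

```python
import sqlite3

# Column names follow the post; types and keys are assumed for this sketch.
SCHEMA = """
CREATE TABLE CAR (CarID INTEGER PRIMARY KEY, CarName TEXT NOT NULL);
CREATE TABLE CUSTOMER (
    CustomerID INTEGER PRIMARY KEY, cFirstName TEXT, cLastName TEXT
);
CREATE TABLE MANUFACTURER (
    ManufacturerID INTEGER PRIMARY KEY, ManufacturerName TEXT NOT NULL
);
CREATE TABLE CUSTOMER_CAR (
    CustomerID INTEGER REFERENCES CUSTOMER(CustomerID),
    CarID INTEGER REFERENCES CAR(CarID),
    BuyYear INTEGER,
    PRIMARY KEY (CustomerID, CarID, BuyYear)
);
CREATE TABLE MANUFACTURER_CAR (
    ManufacturerID INTEGER REFERENCES MANUFACTURER(ManufacturerID),
    CarID INTEGER REFERENCES CAR(CarID),
    ReleaseYear INTEGER,
    PRIMARY KEY (ManufacturerID, CarID)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up placeholder rows, not the contents of the real Car Database.
conn.executemany("INSERT INTO CAR VALUES (?, ?)",
                 [(1, "Brand A"), (2, "Brand B"), (3, "Brand C")])
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?, ?)",
                 [(1, "Pat", "Lee"), (2, "Sam", "Roy")])
conn.executemany("INSERT INTO MANUFACTURER VALUES (?, ?)",
                 [(1, "Maker X"), (2, "Maker Y")])
conn.executemany("INSERT INTO CUSTOMER_CAR VALUES (?, ?, ?)",
                 [(1, 1, 2001), (1, 2, 2003), (2, 3, 2002)])
conn.executemany("INSERT INTO MANUFACTURER_CAR VALUES (?, ?, ?)",
                 [(1, 1, 1974), (1, 2, 1980), (2, 3, 1974)])

# A plain join across all five tables: who bought which car, made by whom.
purchases = conn.execute("""
    SELECT c.cFirstName, c.cLastName, car.CarName,
           m.ManufacturerName, cc.BuyYear
    FROM CUSTOMER c
    JOIN CUSTOMER_CAR cc ON cc.CustomerID = c.CustomerID
    JOIN CAR car ON car.CarID = cc.CarID
    JOIN MANUFACTURER_CAR mc ON mc.CarID = car.CarID
    JOIN MANUFACTURER m ON m.ManufacturerID = mc.ManufacturerID
    ORDER BY c.CustomerID, car.CarID
""").fetchall()

for row in purchases:
    print(row)
```

Notice that customer and manufacturer names are stored exactly once; the two relationship tables carry only IDs and a year, which is what keeps the redundancy low.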

To understand why I called non-procedural SQL too perfect, try to write a non-procedural, single-pass SQL statement that will return

(a) first and last names of all customers who have purchased all the cars released in 1974

(b) first and last names of all customers who have purchased all cars manufactured by Chrysler

(c) first and last names of all customers who have purchased all cars manufactured by Ford

(d) first and last names of all customers who have purchased all cars manufactured either by Chrysler or by Ford. If you get to this point using Microsoft Access, you will know exactly why it has a weak SQL engine.

You can use sub-queries and correlated sub-queries but no procedural or multi-pass SQL.

I guess I'm about to fall asleep now. Therefore, I'll write the solutions in my blog next week.

Until then enjoy these brain teasers!

Saturday, August 7, 2010

Management Meta-Functions and Technology


Note: Demand Management in this blog refers to management of demand for resources within a firm. I want to clarify it at the outset, since all business organizations are on the supply-side from a macroeconomic point of view.

No business organization has a demand management function as a department entrusted with the responsibility of managing internal demand for resources. But this function is critical and strategic. In fact, it is a meta-function of all functional areas and it requires active involvement of executive management and the board.

Within any business organization, demand is the sum total of all current and outstanding requests for resources. Resources could be dollars, labor or technology, and generally these resources are fungible. There are countless instances: we need to make two more acquisitions, spend more to enhance customer service, upgrade technology, roll out new products, grab a bigger market share, improve marketing of our new products, spend on improving product quality, give salary increases to hard-working staff and so on.

Demand management is mostly the responsibility of senior management and this responsibility extends to the executive level and board room too. Demand is controlled and directed by the mission, strategy, goals, policies, governance processes, capital budgeting, spending limits, return on investment (ROI) targets, et cetera.

On the IT side, demand is controlled and aligned with strategic objectives through a technology governance process that includes, inter alia, "tough love" (sorry, Mr. Cruz Bustamante) prioritization, chargeback mechanisms, and reviews of business cases.

Since demand always exceeds capacity, management of demand is about making choices. Sometimes there is a fear of making choices, and at other times macho managers think that they can do everything: expand the target customer segment and meet all their demands, reduce costs and improve quality. Making a choice involves giving something up. Choice involves trade-offs. Choice means deciding what we won't do. Choice means saying no. Doing everything and keeping everyone happy seems so much easier. So demand management is left to the winds of the organizational weather system, whether it is functional or dysfunctional.

At the strategic level, weaknesses of demand management show up in compromises and an insidious erosion of competitive advantage over a period of time. But in the short run, middle management has to figure out how to bring in the resources to meet the demand coming from various corporate projects and day-to-day operational priorities.

Supply management, which requires provisioning and allocation of resources, is broadly a responsibility of middle management. Middle management has to make sure that their staff are not drinking beer with their feet on the table. This last sentence is a joke, but what I mean is that middle management needs to make sure that available staff and skills are effectively and efficiently allocated to critical activities. If the demand management meta-function is not effective, then either middle management has to step up to control the demand or it will end up dropping one of the balls - no pun intended.

Finally, we come to the meta-function of capacity management. What is capacity? Capacity is the upper limit of the aggregate output of the firm. It is the maximum ability of a firm to get the work done. There are three key determinants of capacity:

1. Staffing
2. Best practices
3. Technology


Business organizations can expand their capacity by improving capacity utilization or by adding new capacity. Addition of new capacity is expensive, particularly when existing capacity has not been driven to achieve its maximum. All business organizations try to expand their capacity. Often this meta-function is recognized as doing more with less. Some of the firms do this by having their managers scream harder and louder. Others do this by burning the midnight and weekend oil. None of these methods are sustainable. Therefore, these are non-strategic. Strategic expansion of capacity involves improving overall fit among various functions and activities of the firm to optimize effort and minimize waste. Southwest Airlines and Wal-Mart are good examples. Strategic expansion of capacity involves recognizing one's strengths and uniqueness. Increases in productivity are invariably part of improvements in the "fit" among various activities of the firm.

You will notice that technology is only one of the variables in the expansion of capacity. Therefore, acquisition of technology to drive capacity expansion across the enterprise involves a lot more than mere implementation.

Technologists have to understand how they help organizations do more with less. At a given level of technology, when maximum capacity utilization and productivity have been achieved through the use of best practices, demand has to match supply. Most IT organizations have begun to understand this and are establishing control processes to manage demand and supply.

In fact, very large organizations have to build the internal economy of the firm in such a way that resources are routed towards the most profitable strategic opportunities.

Each organization has to evolve its own method of building an internal economy for management of the meta-functions of demand, supply, capacity, productivity and competition. For example, several advertising firms have multiple teams competing internally for the same contract. Some advertising firms, in fact, launch several competing projects for the same contract. On the other hand, within technology firms, internal competition takes the form of how to skin a cat. IT departments use project governance, activity-based costing, project management and various software development methodologies to manage these meta-functions to match demand and supply. When they don't, they drop the ball. Some commitments are left unfulfilled. Project timelines are extended. At the end of the day demand matches supply, but it happens mostly in an unpredictable and ad hoc manner.

Thursday, August 5, 2010

Don't Enhance My Shopping Cart Now!

Muckety Muck Electronics, Inc. (MME) was running a discount Internet shopping website for electronic components. MME had a policy of not publishing the prices of its merchandise until the merchandise was added to the shopping cart. The marketing department of MME had found that the rate of impulse buying on the website was quite low. On their first visit, several online customers added components of interest to their cart to check the prices and then left the website. However, the return rate of such first-time visitors was above 37% and almost 70% of them were making a purchase on their return visit.

The shopping cart of MME was initially designed in such a way that it automatically emptied all its contents at the end of the day. Since searching for electronic components with close tolerances was time-consuming, the marketing department estimated that if shopping carts were able to retain contents between visits, then sales were likely to increase by at least 5%.

A project was immediately initiated to retain merchandise in the shopping carts. Since MME had to meet challenging financial targets, a tight timeline was imposed. The programmers modified the shopping cart program. Halfway through the project, the procurement department communicated that the inventory and the cost of procurement were changing on a daily basis; therefore, the availability and prices of shopping cart contents had to be updated too. This was included as a new requirement and all the testing scripts had to be revised. The impact on the timeline was avoided by having the project team work through the weekend.

Finally, the updated shopping carts were released into production. Two months later, sales from return visits were down by 7%. A root cause analysis revealed that the website “listing” module was originally programmed to display only those items that were the best bargains based on a daily comparative pricing feed. Any item that was not a bargain was not listed. This strategy had been decided by the founders. However, when items were retained in the shopping cart, they often lost their bargain status. They were no longer listed but were still retained in the shopping carts with updated prices. This disappointed the returning visitors, who left disgusted comments on the discussion boards.
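The root cause is easier to see in a few lines of code. The following is a minimal Python sketch of the interaction between the two modules, not MME's actual code. The bargain rule (our price below the market price from the feed), the SKU names, the prices and all function names are assumptions invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Item:
    sku: str
    our_price: float
    market_price: float  # taken from the daily comparative pricing feed


def is_bargain(item):
    # Assumed rule: an item is a bargain when it undercuts the market price.
    return item.our_price < item.market_price


def listing(catalog):
    """The "listing" module: show only items that are bargains today."""
    return [item.sku for item in catalog if is_bargain(item)]


def refresh_cart(cart_skus, catalog):
    """The enhanced cart: keep the items and update their prices.

    Nothing here checks whether an item is still a bargain.
    """
    by_sku = {item.sku: item for item in catalog}
    return {sku: by_sku[sku].our_price for sku in cart_skus if sku in by_sku}


def orphaned(cart_skus, catalog):
    """Items still sitting in the cart that the website no longer lists."""
    listed = set(listing(catalog))
    return [sku for sku in cart_skus if sku not in listed]


# Day 1: both items are bargains, so the visitor can find them and add them.
day1 = [Item("R100", 1.00, 1.20), Item("C220", 0.50, 0.60)]
cart = listing(day1)

# Day 2: the feed changes and R100 is no longer a bargain.
day2 = [Item("R100", 1.30, 1.20), Item("C220", 0.45, 0.60)]
cart_day2 = refresh_cart(cart, day2)
orphans_day2 = orphaned(cart, day2)

print(listing(day2))   # ['C220']
print(cart_day2)       # {'R100': 1.3, 'C220': 0.45}
print(orphans_day2)    # ['R100']
```

In this sketch the returning visitor finds R100 in the cart at a higher price than before, while the same item has disappeared from the listing. Each module behaves exactly as specified on its own; the problem only appears when the two rules meet.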

I have a list of two dozen questions related to the issues and problems in this business case. This is a real business case but I've changed the industry and the situation to protect business privacy.