Have you met ROI and TCO?

This article is written in English and Portuguese

English Version:

ROI and TCO are two well-known IT buzzwords. They stand for Return On Investment and Total Cost of Ownership (check the links for more technical descriptions). The first tries to measure how much you earn or lose by investing in something (a software product or solution in the IT world). The second tries to be a measure of the sum of costs you incur when you decide to buy and use a software product or solution. For the technical staff these two measures are not the most important aspects. Usually we just like or dislike a product, period!
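Just to make the first one concrete, ROI is usually computed as the net benefit divided by the cost of the investment; a purely illustrative calculation (made-up numbers, not taken from the studies mentioned below):

\[
\text{ROI} = \frac{\text{benefits} - \text{costs}}{\text{costs}}
\qquad\text{e.g.}\qquad
\frac{130\,000 - 100\,000}{100\,000} = 0.3 = 30\%
\]

TCO, on the other hand, is simply the sum of everything you pay over a period: licenses, hardware, administration, support and so on.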
But the decisions and choices usually aren't made by technical staff. Today the decisions are taken by people who have tight budgets and who need to show corporate management that they are using the company's money properly. In other words, people who do and must care about TCO and ROI.
And many times we hear that Informix is nice technically (easy to work with, robust, scalable etc.), but that it's hard to convince higher management to use it (there are still ghosts from the past and FUD spread by the competition).

Well, now we have studies from two consulting firms precisely about these two topics.

The first comes from Forrester Consulting and analyses the costs, benefits and ROI of implementing Informix. Needless to say, it concludes that you can actually save money by investing in Informix, and it explains why. The study was based on a real company.

The second comes from the ITG group and compares the TCO of Informix vs Microsoft SQL Server for mid-sized companies. It's based on a survey with tens of participants and the conclusions are, as you might expect, favorable to Informix (it's important to note that MS SQL Server has a reputation for being "economical", so we might expect the results to be even better against other competitors).

The studies can be freely obtained from the IBM website, and as usual you need to fill in some data. I know, I know.... this is always annoying, and we could argue that it should just be a link allowing quick access, but honestly it takes about a minute to fill in the form, and you have the usual option to request that you won't be bothered afterwards (no emails, phone calls etc.)
Don't let it discourage you. The studies are well worth the effort. Their URLs are:

Finally, note that these studies used Informix 11.50. And we're getting closer to Panther, which has lots of improvements, some of which will certainly have an impact on running costs (less burden for the DBAs).


Portuguese version:

ROI and TCO are two buzzwords very much in vogue in the IT world. They stand for Return On Investment and Total Cost of Ownership (check the links for more technical explanations - even more detailed in the English version). The first tries to measure how much you gain or lose by investing in something (a software product or solution in the IT world). The second tries to be a measure of the sum of costs we incur when we decide to use a software product or solution. For technical staff these measures are not the most important aspects. Usually we just like a product or we don't, period!
But decisions and choices are usually not made by technical staff. Nowadays decisions are taken by people who have tight budgets and who feel the need to show company management that they are spending the company's money wisely. In other words, people who care, and must care, about TCO and ROI.
And many times we hear that Informix is very good technically (simple, easy to use, reliable etc.), but that it's hard to convince management to use it (there are still ghosts from the past and FUD spread by the competition).


Well, now we have two studies from two consulting firms precisely about these topics.

The first comes from Forrester Consulting and analyses the costs, benefits and ROI of implementing Informix. Needless to say, it concludes that we actually save money by investing in Informix, and it explains why. The study was based on a real case.

The second comes from the ITG Group and compares the TCO of Informix vs Microsoft SQL Server for mid-sized companies. It's based on a survey with dozens of participants and the conclusions are, as you would expect, favorable to Informix (it's important to note that MS SQL Server has a reputation for being "economical", so we might think the result would be even more favorable against other competitors).

The studies can be obtained free of charge from the IBM website and, as usual, we need to fill in some data. Yes, I know, I know.... this is always annoying and it's debatable whether it shouldn't just be a matter of clicking a link to access the studies, but honestly it takes no more than a minute to fill in the form, and it has the usual options so that we're not bothered afterwards (no emails or phone calls etc.)
Don't let that discourage you. The studies are well worth this small effort. Their addresses are:

Finally, note that these studies focused on Informix 11.50. And we're getting closer to the Panther version, which will bring many improvements, some of which will certainly have an impact on running costs (less burden for the DBAs).

Let's optimize

This article is written in English and Portuguese

English version.

In this article I'll show you a curious situation related to the Informix query optimizer. There are several reasons to do this. First, this is a curious behavior which by itself deserves a few lines. Another reason is that I truly admire the people who write this stuff... I did and I still do some programming, but my previous experience was related to business applications and currently it's mostly scripts, small utilities, stored procedures etc. I don't mean to offend anyone, but I honestly believe these two programming worlds are in completely different leagues. All this to say that a query optimizer is something admirable given the complexity of the issues it has to solve.

I also have an empirical view, or impression, regarding several query optimizers in the database market. This impression comes from my personal experience, my contact with professionals who work with other technologies and some reading... Basically I think one of our biggest competitors has an optimizer with inconsistent behavior, difficult to understand and not very trustworthy. One of the other IBM relational databases, in this case DB2, probably has the most "intelligent" optimizer, but I have nearly no experience with it. As for Informix, I consider it not the most sophisticated (lots of improvements in V10 and V11, and more to come in Panther), but I really love it for being robust. Typically it just needs statistics. I think I have only used or recommended directives twice since I started working with Informix. And I believe I have never personally hit an optimizer bug (and if you check the fix lists in each fixpack you'll see some of them there...)
To end the introduction I'd like to add that one of my usual tasks is query optimization... So I have caught a few interesting situations. As I wrote above, typically an UPDATE STATISTICS solves the problem. A few times I saw situations where I noticed optimizer limitations (things could work better if it could solve the query another way), or even engine limitations (the query plan is good, but the execution takes unnecessary time to complete). The following situation was brought to my attention by a customer who was looking into queries using a specific table that showed a considerable number of sequential scans.... Let's see the table schema first:

create table example_table
(
et_id serial not null ,
et_ts datetime year to second,
et_col1 smallint,
et_col2 integer,
et_var1 varchar(50),
et_var2 varchar(50),
et_var3 varchar(50),
et_var4 varchar(50),
et_var5 varchar(50),
primary key (et_id)
)
lock mode row;

create index ids_example_table_1 on example_table (et_ts) using btree in dbs1;


And then two query plans they managed to obtain:


QUERY: (OPTIMIZATION TIMESTAMP: 08-25-2010 12:31:56)
------
SELECT et_ts, et_col2
FROM example_table WHERE et_var2='LessCommonValue'
AND et_col1 = 1 ORDER BY et_ts DESC


Estimated Cost: 5439
Estimated # of Rows Returned: 684
Temporary Files Required For: Order By

1) informix.example_table: SEQUENTIAL SCAN

Filters: (informix.example_table.et_var2 = 'LessCommonValue' AND informix.example_table.et_col1 = 1 )


So this was the query responsible for their sequential scans. Nothing special. The conditions are on non-indexed fields, so no index is used. It includes an ORDER BY clause, so a temporary file for sorting is needed (actually it can be done in memory, but the plan means that a sort operation has to be done).
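As a side note, query plans like the ones in this article can be captured with the SET EXPLAIN statement. Here's a minimal sketch (the database name mydb is just a placeholder), run through dbaccess:

-- Ask the optimizer for the plan without actually executing the query.
-- The textual plan is appended to a file, typically sqexplain.out in the current directory.
DATABASE mydb;

SET EXPLAIN ON AVOID_EXECUTE;

SELECT et_ts, et_col2
FROM example_table
WHERE et_var2 = 'LessCommonValue'
AND et_col1 = 1
ORDER BY et_ts DESC;

SET EXPLAIN OFF;

Drop the AVOID_EXECUTE keyword if you want the query to actually run as well.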
So, nothing noticeable in this first plan... But let's check another query plan on the same table. In fact, the query is almost the same:


QUERY: (OPTIMIZATION TIMESTAMP: 08-25-2010 12:28:32)
------
SELECT et_ts, et_col2
FROM example_table WHERE et_var2='MoreCommonValue'
AND et_col1 = 1 ORDER BY et_ts DESC


Estimated Cost: 5915
Estimated # of Rows Returned: 2384

1) informix.example_table: INDEX PATH

Filters: (informix.example_table.et_var2 = 'MoreCommonValue' AND informix.example_table.et_col1 = 1 )

(1) Index Name: informix.ids_example_table_1
Index Keys: et_ts (Serial, fragments: ALL)


Ok... It's almost the same query, but this time we have an INDEX PATH.... Wait... Didn't I write that the conditions were not on indexed column(s)? Yes. Another difference: it does not need a temporary file for sorting (meaning it's not doing a sort operation). Also note that there is no filter applied on the index. So, in this case it's doing something completely different from the previous situation. It reads the index (the whole index), accesses the rows (so it gets them already ordered) and then applies the filters, discarding the rows that do not match. It does that because it wants to avoid the sort operation. But why does it change the behavior? You might have noticed that I already included a clue about this. To protect data privacy, I changed the table name, the table fields and the query values. In the first query I used "LessCommonValue" and in the second I used "MoreCommonValue". So, as you might expect, the value in the second query is more frequent in the table than the value in the first query. How do we know that? First, by looking at the estimated number of rows returned (684 versus 2384). How does it predict these values? Because the table has statistics, and in particular column distributions. Here they are for the et_var2 column (taken with dbschema -hd):


Distribution for informix.example_table.et_var2

Constructed on 2010-08-25 12:19:04.31216

Medium Mode, 2.500000 Resolution, 0.950000 Confidence


--- DISTRIBUTION ---

( AnotherValue )


--- OVERFLOW ---

1: ( 4347, Value_1 )
2: ( 4231, Value_2 )
3: ( 3129, MoreCommonValue )
4: ( 2840, Value_4 )
5: ( 2405, Value_5 )
6: ( 3854, Value_6 )
7: ( 3941, Value_7 )
8: ( 2086, Value_8 )
9: ( 4086, Value_9 )
10: ( 4144, Value_10 )
11: ( 4086, Value_11 )
12: ( 2666, Value_12 )
13: ( 2869, Value_13 )
14: ( 3245, Value_14 )
15: ( 2811, Value_15 )
16: ( 3419, Value_16 )
17: ( 2637, Value_17 )
18: ( 3187, Value_18 )
19: ( 4347, Value_19 )
20: ( 898, LessCommonValue )
21: ( 4144, Value_20 )
22: ( 2260, Value_21 )
23: ( 3999, Value_22 )
24: ( 1797, Value_23 )
25: ( 2115, Value_24 )
26: ( 2173, Value_25 )
27: ( 4144, Value_26 )


If you're not used to looking at this output I'll describe it. The distributions are shown as a list of "bins". Each bin, except the first one, contains 3 columns: the number of records it represents, the number of unique values within that bin, and the highest value within that bin. The above is not a good example because the "normal" bins section has only one value (AnotherValue), and given that it's the first one it means it's the "lowest" value in the column. Let's see an example from the dbschema description in the migration guide:


( 5)
1: ( 16, 7, 11)
2: ( 16, 6, 17)
3: ( 16, 8, 25)
4: ( 16, 8, 38)
5: ( 16, 7, 52)
6: ( 16, 8, 73)
7: ( 16, 12, 95)
8: ( 16, 12, 139)
9: ( 16, 11, 182)
10: ( 10, 5, 200)


Ok. So, looking at this we can see that the lowest value is 5, and that between 5 and 11 we have 16 values, 7 of them unique. Between 11 and 17 we have another 16 values, 6 of them unique, and so on.

Going back to our example we see an "OVERFLOW" section. This is created when there are highly repeated values that would skew the distributions. In these cases the repeated values are shown along with the number of times they appear.
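Just as a reminder, distributions like the ones shown here are built by UPDATE STATISTICS in MEDIUM (or HIGH) mode and can then be dumped with dbschema. A minimal sketch matching the resolution and confidence in the header above (mydb is just a placeholder for the database name):

-- Build or refresh data distributions for the filtered column.
-- MEDIUM mode, 2.5 resolution and 0.95 confidence match the output shown above.
UPDATE STATISTICS MEDIUM FOR TABLE example_table (et_var2) RESOLUTION 2.5 0.95;

-- To dump the distributions afterwards, from the shell:
--   dbschema -d mydb -hd example_table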

So, in our case, the values we're interested in are "LessCommonValue" (898 times) and "MoreCommonValue" (3129 times). With this evidence we can understand the decision to choose the INDEX PATH. When the engine gets to the end of the result set it's already ordered. And in this situation (more than 3000 rows) the sort would be much more expensive than for the other value (around 800 rows).

It's very arguable whether the choice is effectively correct. But my point is that it cares... Meaning it goes deep in its analysis, takes into account many aspects, and is able to decide to use a query plan that is not obvious (and choose it for good reasons).
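If you ever wanted to check whether that bet really pays off, optimizer directives make it easy to force the "obvious" plan and compare. This is just a sketch of the idea, not a recommendation to use directives in production:

-- Force a full scan (and therefore the sort) for the more frequent value,
-- so its cost and timings can be compared with the INDEX PATH plan above.
SELECT {+ FULL(example_table) } et_ts, et_col2
FROM example_table
WHERE et_var2 = 'MoreCommonValue'
AND et_col1 = 1
ORDER BY et_ts DESC;

Comparing the estimated costs and the real execution times of the two alternatives is usually enough to see which strategy wins for each value.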


Portuguese version:

In this article I'll show a curious situation related to the Informix query optimizer. There are several reasons to do this. First, because it's an interesting behavior that by itself deserves a few lines. Another reason is that I truly admire the people who write this component.... I did and still do some programming, but my previous experience was related to business applications and currently it's mostly scripts, small utilities, stored procedures etc. I don't mean to offend anyone, but I honestly believe these two programming worlds are in completely different leagues. All this to say that the optimizer is something admirable given the complexity of the problems it has to solve.

I also have an empirical view, or just an impression, regarding some optimizers in the database market. This impression comes from my personal experience, my contact with professionals who work with other technologies and some reading... Basically I think our biggest competitor has an optimizer with inconsistent behavior, hard to understand and not very trustworthy. One of the other IBM relational databases, DB2 in this case, probably has the most "intelligent" optimizer, but I have practically no experience with it. As for Informix, I consider it not the most sophisticated (many improvements in versions 10 and 11, and more in Panther), but I really love its robustness. Typically it just needs statistics. I think I've only used or recommended optimizer directives twice since I've been working with Informix. And as far as I recall I never personally hit an optimizer bug (although you only have to check the fix list of each fixpack to know they exist).
To finish this introduction I'd like to mention that one of my usual tasks is query optimization. So I've come across a few interesting situations. As I wrote above, typically an UPDATE STATISTICS solves the problem. A few times I found optimizer limitations (things could go better if it could solve the queries another way) and even engine limitations (the generated query plan is good, but the execution takes unnecessary time to run). The following situation was brought to my attention by a customer who was analysing queries on a table that was suffering a lot of sequential scans.... Let's look at the table definition first:
create table example_table
(
et_id serial not null ,
et_ts datetime year to second,
et_col1 smallint,
et_col2 integer,
et_var1 varchar(50),
et_var2 varchar(50),
et_var3 varchar(50),
et_var4 varchar(50),
et_var5 varchar(50),
primary key (et_id)
)
lock mode row;

create index ids_example_table_1 on example_table (et_ts) using btree in dbs1;


And now the two query plans they obtained:


QUERY: (OPTIMIZATION TIMESTAMP: 08-25-2010 12:31:56)
------
SELECT et_ts, et_col2
FROM example_table WHERE et_var2='ValorMenosComum'
AND et_col1 = 1 ORDER BY et_ts DESC


Estimated Cost: 5439
Estimated # of Rows Returned: 684
Temporary Files Required For: Order By

1) informix.example_table: SEQUENTIAL SCAN

Filters: (informix.example_table.et_var2 = 'ValorMenosComum' AND informix.example_table.et_col1 = 1 )


This was one of the queries responsible for the sequential scans. Nothing special. The query conditions are on non-indexed columns, so no index is used. It includes an ORDER BY clause, so it needs temporary "files" for sorting (actually it can be done in memory, but the plan means it has to do a sort operation).
So, nothing remarkable here.... But let's look at the other plan on the same table. In fact the query is very similar:


QUERY: (OPTIMIZATION TIMESTAMP: 08-25-2010 12:28:32)
------
SELECT et_ts, et_col2
FROM example_table WHERE et_var2='ValorMaisComum'
AND et_col1 = 1 ORDER BY et_ts DESC


Estimated Cost: 5915
Estimated # of Rows Returned: 2384

1) informix.example_table: INDEX PATH

Filters: (informix.example_table.et_var2 = 'ValorMaisComum' AND informix.example_table.et_col1 = 1 )

(1) Index Name: informix.ids_example_table_1
Index Keys: et_ts (Serial, fragments: ALL)

Ok... It's almost the same query, but this time we have an INDEX PATH... Wait a moment!.... Didn't I write that the conditions are not on indexed columns? Yes. Another difference: in this case it doesn't need "temporary files" (which means it's not doing a sort). Note also that there is no filter applied to the index. So, in this case it's doing something completely different from the previous situation. It reads the index (the whole index), accesses the rows (thus getting them already ordered) and then applies the filters, discarding the rows that don't satisfy the conditions. It does this because it wants to avoid the sort operation. But why does it change its behavior? You may have noticed that I already included a clue about this. To protect data privacy I changed the table name, the columns and the query values. And in the first case I used "ValorMenosComum" and in the second I used "ValorMaisComum". So, as you would expect, the value in the second query is more frequent in the table than the value used in the first query. How do we know that? First, by looking at the estimated number of rows returned by the query (684 vs 2384). How does the optimizer predict these values? Through the table statistics, in particular the distributions of each column. Here they are for the et_var2 column (obtained with dbschema -hd):

Distribution for informix.example_table.et_var2

Constructed on 2010-08-25 12:19:04.31216

Medium Mode, 2.500000 Resolution, 0.950000 Confidence


--- DISTRIBUTION ---

( OutroValor )


--- OVERFLOW ---

1: ( 4347, Value_1 )
2: ( 4231, Value_2 )
3: ( 3129, ValorMaisComum )
4: ( 2840, Value_4 )
5: ( 2405, Value_5 )
6: ( 3854, Value_6 )
7: ( 3941, Value_7 )
8: ( 2086, Value_8 )
9: ( 4086, Value_9 )
10: ( 4144, Value_10 )
11: ( 4086, Value_11 )
12: ( 2666, Value_12 )
13: ( 2869, Value_13 )
14: ( 3245, Value_14 )
15: ( 2811, Value_15 )
16: ( 3419, Value_16 )
17: ( 2637, Value_17 )
18: ( 3187, Value_18 )
19: ( 4347, Value_19 )
20: ( 898, ValorMenosComum )
21: ( 4144, Value_20 )
22: ( 2260, Value_21 )
23: ( 3999, Value_22 )
24: ( 1797, Value_23 )
25: ( 2115, Value_24 )
26: ( 2173, Value_25 )
27: ( 4144, Value_26 )


In case you're not used to analysing this information, here's a short description of it. The distributions are shown as a list of "bins". Each bin, except the first, contains 3 columns: the number of records it represents, the number of unique values in that bin and the highest value in that bin. The output above is not a good example because the "normal" bins section only has one value (OutroValor), and since it's the first bin it only shows that value, which means it's the "lowest" value in the column. Let's look at an example taken from the description of the dbschema utility in the migration guide:


( 5)
1: ( 16, 7, 11)
2: ( 16, 6, 17)
3: ( 16, 8, 25)
4: ( 16, 8, 38)
5: ( 16, 7, 52)
6: ( 16, 8, 73)
7: ( 16, 12, 95)
8: ( 16, 12, 139)
9: ( 16, 11, 182)
10: ( 10, 5, 200)


So, looking at this we can see that the lowest value is 5, and that between 5 and 11 we have 16 values, 7 of which are unique. Between 11 and 17 we have another 16 values, 6 of them unique, and so on.

Going back to our concrete situation, we see an "OVERFLOW" section. This section is created when there are highly repeated values that would distort or unbalance the distributions. In these cases the repeated values are shown alongside the number of times they appear.

So, in our case, we're interested in the values "ValorMenosComum" (898 times) and "ValorMaisComum" (3129 times). With this evidence we can understand the decision to choose the INDEX PATH. When the engine reaches the end of the result set, it is already ordered. And in this situation (more than 3000 rows) the sort would consume more resources than for the other value under analysis (around 800 rows).


It may be quite debatable whether the decision is effectively correct. But what I want to highlight is that the optimizer cares.... In other words, its analysis of the conditions and options is deep, and it is able to choose a query plan that is not obvious (and choose it for good reasons).