Vol. 10, No. 1

January 2008

PQ Systems
 
Contents

When Excel isn't enough

Quality Quiz: With a video!

Data in everyday life

Six Sigma

Bytes and pieces

FYI: Current releases

 

Send Quality eLine
to a friend!

Just type in your friend's email below:

 

Sign up
If you received this newsletter from a friend and want your own subscription to Quality eLine, click below.

Subscribe to Quality eLine

 
Software

 

   

When Excel isn’t enough:
Charting your data

Why does one need a software program to generate charts, when spread sheet software itself can produce a variety of chart formats?

Indeed, this is a question that quality professionals explore frequently, summoning appropriate rationale to convince purchasing agents or supervisors that the expenditure is justified by the results.

One can, of course, develop charts from spreadsheet programs. One can also, as we all know from experience, use whatever is handy to pound a nail into a wall to hang a picture. (You mean you’ve never grabbed your stapler, or a paperweight, to facilitate a simple task such as this?)

Just as in the case of the stapler-as-hammer, using a spreadsheet to create charts that can provide information with customized charts, graphs, and diagrams, clearly works, but the question is—how well? Any master carpenter will emphasize the importance of using the right tool for a job, and the charting job is no exception.

Many customers have Excel data in a form similar to the following set, submitted as a query by a customer through the PQ Systems Knowledge Base.

Discharge Date

LOS

1/1/2003

3

1/2/2003

2

1/3/2003

4

1/4/2003

2

1/5/2003

1

etc.

 

Imagine that you have data like this covering an entire year or even more. You might be interested to know how the average length of stay is changing throughout the year. There are several ways to go about this using CHARTrunner.

One approach is to define a control chart which treats the LOS column as either a measurement (individuals chart) or a count (c-chart). The problem with this approach is that you may have several thousand rows of data. Each row will create a data point on the chart. Although you will be able to see the average LOS, the chart could be quite congested and it may not show you how LOS is changing over time.

Another approach is to decide on some time interval for grouping the data. For example, with this data you might look at the average LOS by week or by month. To do this you could use the Advanced row selection feature. This is found on the data definition tab of the chart definition form. This allows you to do grouping based on the DischargeDate column and select the grouping units - such as week, month, etc. This is a powerful feature but it does have limitations. For example, if you do advanced row grouping by month, and one of your months contains more than 100 entries the chart will not display. This is due to a subgroup size maximum of 100 for advanced row grouping.

Perhaps the best way to solve this problem is to use a custom SQL query and allow the query itself to do some aggregating of the data. This will be easier if you add a new column to your Excel sheet so that it looks like this:

DischargeDate

LOS

YearMonth

1/1/2003

3

2003 01

1/2/2003

2

2003 01

1/3/2003

4

2003 01

1/4/2003

2

2003 01

1/5/2003

1

2003 01

The YearMonth column can contain an Excel formula based on the DischargeDate column. Here is the formula used in the example:

    =YEAR(A2) &" " & TEXT(MONTH(A2),"00")

Next, we will create a new CHARTrunner chart that uses this Excel sheet as the data source. On the data definition tab, select Custom query as the source for data records - then click on the Edit query button and type in the following custom SQL query:

SELECT Min(YearMonth) as MinYrMonth, format(avg(los),"0.00") as AvgLos FROM [Sheet1$] GROUP BY YearMonth ORDER BY YearMonth

Two things about this query will help us: 1) we are using the GROUP BY clause to aggregate the data by month and 2) we are using the avg() function to calculate an average length of stay for each month. When this query is run the result data will look like this:

MinYrMonth

AvgLos

2003 01

2.89

2003 02

3.07

2003 03

3.18

2003 04

3.11

2003 05

3.12

2003 06

2.94

2003 07

3.35

2003 08

3.18

2003 09

3.19

2003 10

3.30

2003 11

3.09

2003 12

3.09

In CHARTrunner you would treat the MinYrMonth as an identifier and you would treat the AvgLos as a measurement. The resulting chart will contain 12 data points; one for each month found in the data.

Now that you know how the average LOS is changing from month to month - you may also want to know how many data points were used to compute each monthly average. A slight change to the query will give us this information as well. Here is the SQL statement:

SELECT Min(YearMonth) as MinYrMonth, format(avg(los),"0.00") as AvgLos , Count(YearMonth) as NumCases FROM [Sheet1$] GROUP BY YearMonth ORDER BY YearMonth

When this query is run the result data will look like this:

MinYrMonth

AvgLos

NumCases

2003 01

2.89

37

2003 02

3.07

30

2003 03

3.18

38

2003 04

3.11

35

2003 05

3.12

33

2003 06

2.94

33

2003 07

3.35

34

2003 08

3.18

33

2003 09

3.19

31

2003 10

3.30

33

2003 11

3.09

32

2003 12

3.09

33

If you treat the NumCases column as an identifier this information may be displayed above the chart in the identifier grid. This way you will know both the average LOS and the number of cases considered.

 

Copyright 2008 PQ Systems.
Please direct questions or problems regarding this web site to the Webmaster.