A magical Excel trick for flattening data copied from a PivotTable report in tabular layout into usable format

Magical Trick for Data Flattening in Excel

Here is a scenario that I come across sometime ago when I received an Excel file with some apparently messy data for further processing to generate another report. The data in the file appear similar to below screenshot. This data is self-explanatory. It’s the sales in units and dollars of each product under each product category made by each sales person. If you observe the data, for all the sales made by each sales person, name of the sales person is listed only once. Same goes with product category appearing only once for each product.

Now the challenge is to flatten the data in a way that for each product row, respective product category and sales person name should get repeated. Only then, it becomes consumable in downstream process for further preparation of another report. By the way, this is just some fictitious small amount of data. The file I received has more than a few thousands of records, which means it’s not feasible to perform this task manually.

By the way, this kind of data mess happens when you copy the data from a PivotTable report in Tabular Layout, with Repeat All Item Labels option not selected under Layout group of Design tab, +/- toggle button disabled under Show group of Analyze tab of PivotTable Tools ribbon; and then paste the values of such copied data to different location for the further sharing, like it was shared with me.

ExcelDataFlatteningSource

Here is a magical trick to complete this task accurately in just a couple of steps:

Step.1. Apply filter to the header row. Click on Salesperson field filter drop down. Deselect all items and then select only blanks. Hit OK.

The data looks as below after this step.

image

Step:2. After applying the filters, observe that the cell address of the first blank cell in Salesperson column is A4 (boxed in red). This means that the cell above A4 (i.e., cell A3) is the first non-blank cell. This gives us a logic that the value in A4 should have been the same as the value in cell A3. So, just enter the formula =A3 in cell A4. The same logic applies for all the other blank cells too. So, copy the formula to the rest of the blank cells. After this, clear the applied filters from salesperson column header. That works the MAGIC..!! This completely fills up all the blank cells in the Salesperson column accurately as in the below screenshot:

image

Repeat these two steps for Category column as well. The magical trick works there too..!!

For more such tricks follow the blog. Hit the likes if you like the trick. Happy Learning..!!

Advertisements

Calculation of metrics at database level or reporting level..? which is the best way..?

I just came across a same old situation of deciding where to do the calculation of certain metrics like percentages ?

Here is my take on it with a fictitious example –

Scenario 1:  Calculating it at database level and preserving it in the database:

see Fig.:1(Associate Level Sales) showing a table with sales made by a few sales executives in the years 2011 and 2012. Assuming that the percentage increase or decrease in the sales made by each of these executives is calculated at the database level and preserved in the database, the table appears as in the Fig.1 (Associate Level Sales).

Benifits: Now, since the calculations are materialized/preserved in the database, the performance of the reports containing this percentage calculation might be better, as the reports need to directly pull this calculation from the database instead of calculating on the fly.

Caveats: However in this scenario, the problems surfaces when these sales figures are expected to be aggregated at Manager level. See Fig.: 2(Manager Level Sales Pivot table) showing a pivot table built on top of the table data as in Fig.:1(Associate Level Sales) to arrive at the aggregated percentage change calculations. The percentage chanage calculations are wrong since they are doing the sum of the percentage changes at the associate levels to aggregate at the Manager level. whereas the calculations should have happenned as in the Fig.4.
In Fig.4 First manager level volumes are arrived at by adding up the associate level volumes. Percentage changes are then calculated on these manager level aggregated volumes.

PercentCalculationsDecisioning
PercentCalculationsDecisioning

Scenario 2:  Calculating it at the report level (e.g. pivot table in MS Excel or OBIEE or some other reporting tool) on the fly:

In this scenario, a pivot table is built as in scenario:1, but the percentag change calculation is done inside the pivot table as a calculated item using the formula (sales of 2012 – sales of 2011)/sales of 2011. see the results in Fig.: 3. Now these calculations match exactly with those of the Fig.: 4.

Benifits: Aggregations at higgher levels are calculated correctly.

Drawbacks: Since the percentage change calculation is not preserved in the database, it is calculated on the fly everytime the pivot table is refreshed resulting in report performance drop down.

Conclusion:  It’s just common sense that certain measures cannot be aggregated directly. Such measures are called Non-Additive measures. Additive measures should be preserved in database and Non-Additive measures should be calculated at the report level on the fly.

Hope you enjoyed reading this post. Please do share your comments, suggestions or questions, if any and help me improving this further..

Do rate this blog post on the top in terms of number of starts. The more you like it, the more the number of stars.. Happy time until my next blog post.. 🙂