A magical Excel trick for flattening data copied from a PivotTable report in tabular layout into usable format

Magical Trick for Data Flattening in Excel

Here is a scenario that I came across some time ago, when I received an Excel file with some apparently messy data that had to be processed further to generate another report. The data in the file appeared similar to the screenshot below. The data is self-explanatory: it shows the sales, in units and dollars, of each product under each product category made by each salesperson. If you observe the data, the name of each salesperson is listed only once for all the sales made by that salesperson. The same goes for the product category, which appears only once for all the products under it.

Now the challenge is to flatten the data so that the respective product category and salesperson name are repeated on every product row. Only then does the data become consumable in the downstream process that prepares the other report. The data shown here is just a small fictitious sample. The file I received had several thousand records, which means it was not feasible to perform this task manually.

By the way, this kind of data mess happens when you copy data from a PivotTable report in Tabular Layout with the Repeat All Item Labels option not selected (Layout group of the Design tab) and the +/- Buttons toggle turned off (Show group of the Analyze tab of the PivotTable Tools ribbon), and then paste the copied data as values to a different location for further sharing, as was done with the file shared with me.

ExcelDataFlatteningSource

Here is a magical trick to complete this task accurately in just a couple of steps:

Step:1. Apply a filter to the header row. Click the filter drop-down of the Salesperson field, deselect all items, select only the blanks and hit OK.

The data looks as below after this step.

image

Step:2. After applying the filter, observe that the first blank cell in the Salesperson column is A4 (boxed in red). This means that the cell above it (i.e., cell A3) holds the nearest non-blank value, and that the value in A4 should have been the same as the value in A3. So, just enter the formula =A3 in cell A4. The same logic applies to all the other blank cells too, so copy the formula to the rest of the blank cells. Because =A3 is a relative reference, each copy points to the cell directly above it. After this, clear the applied filter from the Salesperson column header. That works the MAGIC..!! This fills up all the blank cells in the Salesperson column accurately, as in the below screenshot:

image

Repeat these two steps for Category column as well. The magical trick works there too..!!
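For readers who like to see the logic spelled out, here is a minimal Python sketch of what the two steps achieve: every blank cell takes the value of the nearest non-blank cell above it. The function name fill_down and the sample rows are made up for illustration and are not the data from the screenshots.

```python
def fill_down(rows, column):
    """Return a copy of rows where blank cells in `column` repeat the value above."""
    filled = []
    last_seen = None
    for row in rows:
        new_row = list(row)  # copy so the original rows stay untouched
        if new_row[column] in ("", None):
            new_row[column] = last_seen  # same idea as entering =A3 in cell A4
        else:
            last_seen = new_row[column]
        filled.append(new_row)
    return filled


# Hypothetical sample: Salesperson, Category, Product, Units
rows = [
    ["Anna", "Fruit", "Apple", 10],
    ["", "", "Banana", 5],
    ["", "Dairy", "Milk", 7],
    ["Ben", "Fruit", "Apple", 3],
]

flat = rows
for col in (0, 1):  # Salesperson first, then Category
    flat = fill_down(flat, col)

for row in flat:
    print(row)
```

The sketch relies on the same assumption as the Excel trick: the first row of each group always carries the label.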

For more such tricks follow the blog. Hit the likes if you like the trick. Happy Learning..!!


Flash Fill Trick in Excel 2016

ExcelFlashFill

Here is a very common exercise that you might have come across or may come across: you have full names of employees in a column and you are expected to extract the last names from these full names into a separate column. In versions prior to Excel 2013, you need to use formulas and functions to accomplish this task. The below screenshot shows an example of how this is achieved and the formula used. Cells A3 to A7 have the full names (first name followed by last name), and cells B3 to B7 have formulas that extract only the last name from the full name.

Extracting Last Name from Full Name using Excel Formula
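The formula itself is only visible in the screenshot, so the snippet below is not that formula. It is just a small Python sketch of the same idea (take whatever follows the last space), using a made-up helper named last_name and the two names mentioned in this post.

```python
def last_name(full_name):
    """Return the last whitespace-separated word of a full name."""
    parts = full_name.split()
    return parts[-1] if parts else ""


names = ["Balakrishna Sappa", "John Short"]
for name in names:
    print(name, "->", last_name(name))
```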

Starting from Excel 2013, you don’t have to be an expert in formulas and functions to achieve this task; instead, you have a magical built-in tool called Flash Fill at your rescue. In our example, in cell B3, instead of typing any formula, manually type the last name from the full name contained in cell A3. Since cell A3 has the text “Balakrishna Sappa”, the last name is “Sappa”. Now in the next cell, i.e., cell B4, just type the first letter of the last name from the corresponding full name. In this example, since cell A4 has the text “John Short”, the first letter of the last name is “S”. As soon as you type “S” here, you will see Excel suggesting the rest of the letters to complete your entry. In our example, Excel suggests the complete text “Sappa”, i.e., Excel looks at the values present in the cells immediately above the active cell and makes its suggestion accordingly. This feature in Excel is called “AutoComplete”.

ExcelAutoFill

But evidently, this doesn’t serve our purpose. So, in the above step, instead of accepting Excel’s suggestion, continue typing the second letter in the last name. i.e., you must type “Sh” in cell B4. Now, Excel rightly suggests the last names for all the rest of the full names including the one in the active cell. See the below screenshot. Just hit enter and you are done with the task. No knowledge of formulas and functions needed to complete the task. This feature of Excel is called Flash Fill.

FlashFill1

Note: An important point to remember when using Flash Fill is that you must enter the complete first value manually and then enough of the second row’s value for Excel to recognize the pattern. In our example, we manually entered the complete last name in cell B3 and the first two letters of the last name in cell B4, because the first letter alone only triggered AutoComplete. These are the steps that enable Excel’s Flash Fill to work its intelligence.

The above screenshots are from Excel 2016.

Hope this saves you a good deal of time when you are confronted with this or similar seemingly intimidating tasks..!!

Happy Learning..!! Do leave your comments and hit the likes if you find this useful.

New function in Excel 2016 to display in a cell, the formula contained in another cell


You must have come across a number of situations where you were trying to explain both a formula that you have written in Excel and the result of that formula. This kind of situation demands that the formula expression and the formula result be displayed simultaneously. For example, from the below screenshot, if you are trying to explain that the formula in cell G2 is =TODAY() and that this is a built-in function in Excel that returns today’s date, you might ask your audience to view the result in the cell (the active cell in the below screenshot) and to view the formula contained in the active cell in the formula bar (highlighted in the rounded box in the below screenshot).

 

image

 

What if you want to see both the formula and its result side by side, or one below the other, and avoid looking at two far corners of the window? Excel offers a function named FORMULATEXT (available since Excel 2013) for this purpose. FORMULATEXT takes a single mandatory argument: a reference to the cell containing the formula that you wish to view. It returns, as text, the formula contained in that cell. In our example, if you enter the formula =FORMULATEXT(G2) in cell F2, it returns =TODAY() as in the below screenshot. Now the formula used in cell G2 and the value returned by it are side by side, making it easier to understand the formula and to explain it to others.
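To make the behaviour concrete, here is a toy Python model of it. This is only a sketch and not how Excel works internally: the sheet is a plain dictionary, the helper name formulatext is made up, and the cell A1 holding a constant is an assumption added for contrast. For a cell without a formula, Excel's FORMULATEXT returns the #N/A error, which the sketch imitates with a string.

```python
def formulatext(sheet, address):
    """Return the formula stored in a cell as text, or '#N/A' if it holds no formula."""
    content = sheet.get(address)
    if isinstance(content, str) and content.startswith("="):
        return content
    return "#N/A"


# Toy worksheet: G2 holds the formula from the example, A1 a plain constant
sheet = {"G2": "=TODAY()", "A1": 42}

print(formulatext(sheet, "G2"))  # the formula text of G2
print(formulatext(sheet, "A1"))  # no formula in A1
```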

 

image

 

A nifty, handy formula for people like me. Happy Excel Learning..!!

Trick for Listing the names of all sheets in Excel workbook

List of Sheet Names output

Here is a 2 step technique to retrieve the names of all sheets in Excel workbook:

Step:1. First define a named range using the below formula:

Step: 1a. Click the “Define Name” command in the “Defined Names” group under the “Formulas” ribbon tab to invoke the “New Name” dialogue

Step: 1b. Let the name of the range be “SheetNames”

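Step: 1c. Let the name refer to the formula: =GET.WORKBOOK(1). Note that GET.WORKBOOK is an Excel 4.0 macro function, so the workbook has to be saved as a macro-enabled workbook (.xlsm) for the name to keep working.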

Step: 1d. Click OK

New Name Dialogue

 

 

Step:2. Write the below array formula, which simply refers to the range name defined in Step 1, to list the names of all the sheets in the current workbook:

– Since there are 3 sheets in the workbook in the screenshot, select any three consecutive cells in a row.

– Enter “=” to start writing the formula and then type the range name defined in Step.1 as in the below screenshot:

Array Formula referring the defined Name

 

– Hit CTRL + SHIFT + ENTER to enter the formula as an array formula, and the names of all the sheets are listed. See the below screenshot:

List of Sheet Names output

Note that the sheet names are listed along with the workbook name. You can write a formula using Excel’s built-in functions to extract only sheet name from these names that are qualified by the workbook name.
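The extraction mentioned in the note boils down to keeping the text after the closing square bracket. Here is a small Python sketch of that logic. It assumes the qualified names look like [Book1.xlsx]Sheet1; the workbook and sheet names, and the helper name sheet_name_only, are made up for illustration.

```python
def sheet_name_only(qualified_name):
    """Strip a leading '[workbook]' qualifier, e.g. '[Book1.xlsx]Sheet1' -> 'Sheet1'."""
    # rpartition returns the text after the last ']' and leaves
    # names without a qualifier unchanged
    return qualified_name.rpartition("]")[2]


qualified = ["[Book1.xlsx]Sheet1", "[Book1.xlsx]Sales", "[Book1.xlsx]Summary"]
print([sheet_name_only(name) for name in qualified])
```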

Happy learning..!! Do leave your comments.

Generating Date Dimension data in Microsoft Excel

Generating Date Dimension data in Excel

Having shared the SQL technique and the script for generating Date Dimension data for use in self-service BI and analytics in my earlier post, here I share another technique for the same purpose, meant for those of you who are not familiar with SQL.

Here is the Excel file that lets you create Date Dimension data for two years from a given start date. All that you have to do is enter the desired start of the year in date format in cell B3. The rest is taken care of: everything refreshes automatically.

Hit the like stars if you find it useful. Share the post further to spread the learning. Follow the analyst arsenal for effortless continued learning. Happy Learning..!!

Oracle SQL script to generate Date Dimension data for use in Microsoft Excel Data Models

The Date dimension is that one dimension or business perspective without which an Excel Data Model cannot ever exist. Here is an Oracle SQL script to generate the data for a Date Dimension. All that you have to do is change the start date and end date values in the last line of the below SQL query and execute it. Try it and experience the magic..!! Please leave your suggestions, comments and any enhancements that you might want to see in this.

Here is the downloadable script on Oracle LiveSQL:

/***********************
Description: Date Dimension Data Generation
Date Created: 14/Nov/2016
Author: Balakrishna Sappa
************************/

-- Object type describing one row of the date dimension
CREATE OR REPLACE TYPE DATE_OBJ AS OBJECT(DATE_KEY VARCHAR2(8), DATE_DT DATE, DATE_TYPE VARCHAR2(10), DAY_OF_MONTH SMALLINT, CAL_DAY_OF_YEAR SMALLINT, DAY_OF_WEEK_FULL_NAME VARCHAR2(9)
, DAY_OF_WEEK_SHORT_NAME CHAR(3), CAL_MONTH_FULL_NAME VARCHAR2(9), CAL_MONTH_SHORT_NAME CHAR(3), CAL_MONTH_ENUM SMALLINT, CAL_QTR_NAME CHAR(2), CAL_QTR_ENUM SMALLINT, CAL_HYR_NAME CHAR(2)
, CAL_HYR_ENUM SMALLINT, CAL_YEAR INT, START_OF_MONTH DATE, END_OF_MONTH DATE);
/
-- Collection type returned by the generator function
CREATE OR REPLACE TYPE CALENDAR_DATES IS TABLE OF DATE_OBJ;
/
CREATE OR REPLACE FUNCTION UDF_CALENDAR_GENERATOR(START_DATE IN DATE, END_DATE IN DATE)
RETURN CALENDAR_DATES
IS
    CALENDAR_DATE CALENDAR_DATES;
BEGIN
    -- Recursive CTE: one row per day from START_DATE to END_DATE
    SELECT CAST(MULTISET(WITH RCTE_DATES (DATE_DT)
                         AS
                         (SELECT START_DATE AS DATE_DT
                          FROM DUAL
                          UNION ALL
                          SELECT DATE_DT + 1 AS DATE_DT
                          FROM RCTE_DATES
                          WHERE DATE_DT + 1 <= END_DATE
                         )
                         SELECT CAST(EXTRACT(YEAR FROM DATE_DT) AS CHAR(4)) || SUBSTR('00' || EXTRACT(MONTH FROM DATE_DT), -2, 2) || SUBSTR('00' || EXTRACT(DAY FROM DATE_DT), -2, 2) AS DATE_KEY
                              , DATE_DT
                              , 'DATE' AS DATE_TYPE
                              , EXTRACT(DAY FROM DATE_DT) AS DAY_OF_MONTH
                              , CAST(TO_CHAR(DATE_DT, 'DDD') AS SMALLINT) AS CAL_DAY_OF_YEAR
                              , TO_CHAR(DATE_DT, 'DAY') AS DAY_OF_WEEK_FULL_NAME
                              , TO_CHAR(DATE_DT, 'DY') AS DAY_OF_WEEK_SHORT_NAME
                              , TO_CHAR(DATE_DT, 'MONTH') AS CAL_MONTH_FULL_NAME
                              , TO_CHAR(DATE_DT, 'MON') AS CAL_MONTH_SHORT_NAME
                              , EXTRACT(MONTH FROM DATE_DT) AS CAL_MONTH_ENUM
                              , 'Q' || TO_CHAR(DATE_DT, 'Q') AS CAL_QTR_NAME
                              , CAST(TO_CHAR(DATE_DT, 'Q') AS SMALLINT) AS CAL_QTR_ENUM
                              , 'H' || CASE WHEN EXTRACT(MONTH FROM DATE_DT) IN (1,2,3,4,5,6)
                                            THEN 1
                                            ELSE 2
                                       END AS CAL_HYR_NAME
                              , CASE WHEN EXTRACT(MONTH FROM DATE_DT) IN (1,2,3,4,5,6)
                                     THEN 1
                                     ELSE 2
                                END AS CAL_HYR_ENUM
                              , EXTRACT(YEAR FROM DATE_DT) AS CAL_YEAR
                              , TRUNC(DATE_DT, 'MONTH') AS START_OF_MONTH
                              , TRUNC(ADD_MONTHS(DATE_DT, 1), 'MONTH')-1 AS END_OF_MONTH
                         FROM RCTE_DATES
                         UNION
                         -- Placeholder row for unknown / not applicable dates
                         SELECT '999999' AS DATE_KEY
                              , TO_DATE('31-DEC-9999', 'DD-MON-YYYY') AS DATE_DT
                              , 'NA' AS DATE_TYPE
                              , 0 AS DAY_OF_MONTH
                              , 0 AS CAL_DAY_OF_YEAR
                              , 'NA' AS DAY_OF_WEEK_FULL_NAME
                              , 'NA' AS DAY_OF_WEEK_SHORT_NAME
                              , 'NA' AS CAL_MONTH_FULL_NAME
                              , 'NA' AS CAL_MONTH_SHORT_NAME
                              , 0 AS CAL_MONTH_ENUM
                              , 'NA' AS CAL_QTR_NAME
                              , 0 AS CAL_QTR_ENUM
                              , 'NA' AS CAL_HYR_NAME
                              , 0 AS CAL_HYR_ENUM
                              , 9999 AS CAL_YEAR
                              , TO_DATE('31-DEC-9999', 'DD-MON-YYYY') AS START_OF_MONTH
                              , TO_DATE('31-DEC-9999', 'DD-MON-YYYY') AS END_OF_MONTH
                         FROM DUAL
                        )
                AS CALENDAR_DATES
               )
    INTO CALENDAR_DATE
    FROM DUAL;
    RETURN CALENDAR_DATE;
END;
/
-- Example call (the dates below are placeholders): change the start and end dates as needed
SELECT * FROM TABLE(UDF_CALENDAR_GENERATOR(DATE '2016-01-01', DATE '2017-12-31'));

Below is a screenshot of the results of this SQL executed in Oracle’s free online SQL learning and execution tool: Oracle Live SQL.

datedimensiondatagenerator
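If you want to cross-check the output outside the database, the column logic can be sketched in a few lines of Python. This is only an illustration, not a port of the script: it covers a subset of the columns, leaves out the day and month names (which depend on language settings) and the 'NA' placeholder row, and the function name date_dimension is made up.

```python
import calendar
from datetime import date, timedelta


def date_dimension(start, end):
    """Build one dictionary per calendar day from start to end (inclusive)."""
    rows = []
    current = start
    while current <= end:
        month = current.month
        half = 1 if month <= 6 else 2
        quarter = (month - 1) // 3 + 1
        last_day = calendar.monthrange(current.year, month)[1]
        rows.append({
            "DATE_KEY": current.strftime("%Y%m%d"),
            "DATE_DT": current,
            "DAY_OF_MONTH": current.day,
            "CAL_DAY_OF_YEAR": current.timetuple().tm_yday,
            "CAL_MONTH_ENUM": month,
            "CAL_QTR_NAME": f"Q{quarter}",
            "CAL_HYR_NAME": f"H{half}",
            "CAL_YEAR": current.year,
            "START_OF_MONTH": current.replace(day=1),
            "END_OF_MONTH": current.replace(day=last_day),
        })
        current += timedelta(days=1)
    return rows


# Example range chosen for illustration only
dim = date_dimension(date(2016, 1, 1), date(2016, 12, 31))
print(len(dim), dim[0]["DATE_KEY"], dim[-1]["DATE_KEY"])
```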

 

A White Paper – Solving Relational Division problem in SQL – The Analyst Way

Hi readers..!!

Here I go posting a white paper on one of the classical and intellectually challenging problems that an analyst might come across. The paper is titled – “Understanding and Solving Relational Division problems in SQL – The Analyst way”.

Hope you enjoy reading this paper.

Please leave your comments and suggestions, if any, here.

Will be back with more stuff. Stay tuned to the blog. Until then, happy learning..!!

 

A Use Case of CROSS JOINS in SQL

Use case of SQL CROSS JOIN: Although CROSS JOIN is available in SQL, you rarely find any instance where you can use it. Here is one use case for CROSS JOIN in SQL:

Example with fictitious data: Assuming a soccer tournament is to be played among four teams, what are all the possible matches between these teams, such that each team gets a chance to play each of the other three teams..?

You can use the CREATE and INSERT statements in the below screenshot 1 to create this table. Then run the SQL SELECT statement as in the screenshot 2 to get the desired results:

Screenshots:

image

Here is the SQL script containing all the statements that can be copied: SQL Cross Joins Use Case

Caveat: The CROSS JOIN by itself returns all permutations of the teams, so the same combination is repeated with the order of the teams swapped. i.e., Brazil Vs Italy and Italy Vs Brazil would be treated as separate matches.

Solution: The predicate in the WHERE clause takes care of this and avoids any such repeated combinations and lets you disregard any significance attached to the order of the teams in a match.
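Since the statements themselves live in the screenshots, here is a self-contained sketch of the same idea that you can run with Python's built-in sqlite3 module. The table name, the column name, the teams Spain and Germany, and the exact predicate (a.team_name < b.team_name) are assumptions made for illustration; the original script may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teams (team_name TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO teams (team_name) VALUES (?)",
    [("Brazil",), ("Italy",), ("Spain",), ("Germany",)],
)

# Plain CROSS JOIN: every team paired with every team (4 x 4 rows)
all_pairs = conn.execute(
    "SELECT a.team_name, b.team_name FROM teams a CROSS JOIN teams b"
).fetchall()

# The predicate keeps only one ordering of each pair and drops self-pairs
matches = conn.execute(
    "SELECT a.team_name, b.team_name "
    "FROM teams a CROSS JOIN teams b "
    "WHERE a.team_name < b.team_name "
    "ORDER BY a.team_name, b.team_name"
).fetchall()

print(len(all_pairs), "rows without the predicate")
for home, away in matches:
    print(home, "Vs", away)
```

With four teams the filtered result has 4 x 3 / 2 = 6 matches, which agrees with the combinations formula.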

Mathematical formula for Combinations

Happy learning..!! Don’t forget to leave your comments, friends..!!

A quick trick to create in-cell charts in Excel

Here I share a tricky way to create in-cell charts in Excel with an example. Below is the data of a few employees and the respective volume of requests completed by them.

Excel_InCellChart

Now enter the below formula in cell C2 and copy it till C8.

=REPT("n", B2/50)


Column C now looks as in the below figure:

Excel_InCellChart_3

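Now, select column C and then change the font face to “Wingdings” (Home tab –> Font group –> Font face) and the font color (Home tab –> Font group –> Font color) to a color of your choice (say blue). In the Wingdings font, the letter “n” is drawn as a filled square, which is what turns the repeated letters into a bar.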

 

 

That’s it..!! The in-cell chart now looks like the below in column C:

Excel_InCellChart_2

Explanation: The REPT function in Excel takes the 1st parameter to it and repeats it for required number of times indicated by the 2nd parameter to it. In our example, REPT takes the character ‘n’ and repeats it. The number of times it is repeated is indicated by the volume divided by a constant number of your choice. I am dividing the volumes by a constant number 50 to let the character ‘n’ appear for a small number of times but still retaining their proportionality to other volumes. This is required to make the chart appear compact.
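Here is the same arithmetic as a small Python sketch. The employee names and volumes are made up for illustration and the helper rept is only an imitation of the Excel function for positive counts. As in Excel, a fractional repeat count is truncated, so 275/50 = 5.5 gives five characters.

```python
def rept(text, number_times):
    """Imitate Excel's REPT for positive counts: fractional counts are truncated."""
    return text * int(number_times)


# Hypothetical volumes of completed requests
volumes = {"Asha": 500, "Ravi": 275, "Meera": 120}

for name, volume in volumes.items():
    # Dividing by 50 keeps the bars short but proportional
    print(f"{name:<6} {rept('n', volume / 50)}")
```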

Hope you enjoyed reading this post..!! Want to master Excel with more such techniques..? Follow my blog by entering your email and then hitting on FOLLOW button at the top right or simply click on the FOLLOW button at the bottom right of the window, if it appears in your browser.

Do leave your comments or questions, if any. Happy learning..

Truncating time part from date & time data in Excel

Did you ever come across a situation where you had to compare two dates ignoring their time part to arrive at a status indicator ? Let’s see an example and the trick to address it:

Example Scenario: Here you have some order ids and their respective order dates, target dates and delivery dates as in the below screenshot. Now, you are asked to arrive at the order status using a business rule that states – “If the delivery date is on or before target date, disregarding the time of the day, then update the status as ‘Target Met’ else update it as ‘Past Due’.”

Excel_TruncateTime_1

So the business rule, when translated into an Excel formula, looks like this: =IF(D2<=C2, "Target Met", "Past Due")

When this formula is entered in cell E2 and copied down till E8, the order status looks as in the below figure:

Excel_TruncateTime_2

As per the business rule, the comparison of Target Date and Delivery Date should ignore the time part of the day. Irrespective of what time of the day the order/product has been delivered, if the delivery date falls on the target date, the status should be updated as ‘Target Met’.

However, the statuses for Order IDs 12126 and 11231 are updated as ‘Past Due’ although they are expected to be updated as ‘Target Met’.

Challenge: While comparing Target Date and Delivery Date, Excel is comparing the complete date and timestamps.

Solution: To let Excel consider only dates and ignore timestamps while comparing, truncate the timestamps from the dates using the INT function before doing the comparison. So the formula becomes: =IF(INT(D2)<=INT(C2), "Target Met", "Past Due")

Now the result appears as desired as in the below figure.

Excel_TruncateTime_3

Explanation: All dates and timestamps in Excel are actually serial numbers. 1-Jan-1900 is treated as 1, 2-Jan-1900 is treated as 2 and so on. The time of day is stored as the fractional part of the number, between 0 and 1: 0 represents 12:00:00 AM (start of the day), 0.5 represents 12:00:00 PM (noon) and about 0.99999 represents 11:59:59 PM (end of the day).

In our example, for the Order ID 12126, the numerical representation of Target Date is 41443.66 and that of Delivery Date is 41443.68. When these two values are compared, Delivery Date (41443.68) is not less than or equal to Target Date (41443.66), so the status is updated as ‘Past Due’.

INT() is a function in Excel that takes a number as input and returns the number rounded down to the nearest integer. i.e., for Order ID 12126, INT(D4)<=INT(C4) becomes INT(41443.68)<=INT(41443.66) which evaluates to 41443<=41443 which is TRUE. Hence the status gets updated as ‘Target Met’.
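The same comparison can be reproduced in a few lines of Python, using the serial numbers quoted above for Order ID 12126. This is just a sketch of the logic: math.floor plays the role of INT, and the function names are made up for illustration.

```python
import math


def naive_status(delivery_serial, target_serial):
    """Compare full date-time serial numbers, as in the first formula."""
    return "Target Met" if delivery_serial <= target_serial else "Past Due"


def order_status(delivery_serial, target_serial):
    """Drop the time fraction from both serials before comparing, as INT does."""
    on_time = math.floor(delivery_serial) <= math.floor(target_serial)
    return "Target Met" if on_time else "Past Due"


# Delivery 41443.68 vs target 41443.66: same day, delivered later in the day
print(naive_status(41443.68, 41443.66))
print(order_status(41443.68, 41443.66))
```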

Hope you enjoyed reading this post..!! Want to master Excel with more such techniques..? Follow my blog by entering your email and then hitting on FOLLOW button at the top right or simply click on the FOLLOW button at the bottom right of the window, if it appears in your browser.

Do leave your comments or questions, if any. Happy learning..