Percentile Analysis – Qlik Sense

Last week I was tasked with developing a dashboard to show how well an employee was performing over time relative to their peers across many metrics. Bringing all of these concepts into a single chart took several iterations, but I believe the result could be applicable to many people. In the sample screenshot, we can evaluate Employee D’s performance each quarter relative to several peer benchmarks. In Q2 and Q3, D performed below the worst peer group, but in Q4 they improved. This is based on random data, but this type of analysis is highly valuable for managers to understand and coach employees appropriately.

After many months of building in Qlik, I was able to combine several of its more advanced features to create this chart. It uses the Aggr function to separate the data into appropriate subsets to be processed. Set analysis allows the percentile (fractile) calculation to evaluate the entire set of employees rather than only the current selection.

Development hit one roadblock when I attempted to incorporate the time dimension. This was solved by including the exact time grain (month, quarter, or year) as a dimension in the Aggr function.

Here is the recipe:

Variable to create many percentiles:

eFractile = Fractile({<Employee={*}>} Aggr(Sum({<Employee={*}>} Sales), Employee, Quarter),$1)

Chart – Combo Chart

Dimension: Quarter

Measure 1 [Bar Type]: Sum(Sales)

Measure 2-n [Line Type]: $(eFractile(0.25))
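For readers who do not work in Qlik, the same idea can be sketched in Python. This is only an illustration of the logic, not the Qlik implementation: the function names, the sample rows, and the choice of linear interpolation between ranked values are all assumptions of mine. It first sums sales per employee and quarter (the Aggr step), then takes a fractile across employees within each quarter.

```python
from collections import defaultdict


def fractile(values, p):
    """Return the p-th fractile (0 <= p <= 1) using linear interpolation.

    The interpolation rule is an assumption; check it against your BI tool.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("fractile() needs at least one value")
    position = (len(ordered) - 1) * p
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


def quarterly_fractiles(rows, p):
    """rows: iterable of (employee, quarter, sales) tuples."""
    # Step 1: total sales per (employee, quarter), like Aggr(Sum(Sales), Employee, Quarter).
    totals = defaultdict(float)
    for employee, quarter, sales in rows:
        totals[(employee, quarter)] += sales

    # Step 2: collect the per-employee totals for each quarter.
    by_quarter = defaultdict(list)
    for (employee, quarter), total in totals.items():
        by_quarter[quarter].append(total)

    # Step 3: one fractile per quarter, computed across all employees.
    return {quarter: fractile(vals, p) for quarter, vals in by_quarter.items()}


# Made-up sample data, for illustration only.
sample_rows = [
    ("A", "Q1", 100), ("A", "Q1", 50), ("B", "Q1", 200),
    ("C", "Q1", 300), ("D", "Q1", 400),
    ("A", "Q2", 10), ("B", "Q2", 20), ("C", "Q2", 30), ("D", "Q2", 40),
]

lower_quartile = quarterly_fractiles(sample_rows, 0.25)
median = quarterly_fractiles(sample_rows, 0.5)
```

Calling quarterly_fractiles once per percentile mirrors how the variable above is reused for each line in the combo chart.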

https://charts.qlikcloud.com/5839fa596000fbff00d4d1ce/chart.html

First Meetup in SF

I just got my boxes unpacked and accomplished the first thing I wanted to do after moving to San Francisco: attending my first Meetup. The number of meetups available to me on meetup.com grew enormously when I moved into the home of open-source software and of companies who want to host events. Free food, free beer, and best of all, free learning.

From my experiences on Wednesday, I was able to get a clear picture of how companies create a formal data science infrastructure on the largest scales. MyFitnessPal presented their views and the components that power their company’s real-time, batch, and business intelligence units. Everything I’ve read and learned over the past year-plus has been realized. If I were to offer advice to anyone searching out the best of the best, it would be to move to the places in the world that house the best of the best. I can’t wait to hear more from other industry leaders.

Here’s a link to the slideshare

The Future of Visualization

“The Future of Data Visualization” Jeffrey Heer

Today in my adventures on the internet I stumbled upon a treasure trove of data visualization information. Below are links to an amazing video, “The Future of Data Visualization” by Jeffrey Heer, which is itself a commentary on the past, present, and future use of visuals. The conclusion the speaker comes to is that some visualizations are objectively easier to perceive than others. As an aside, the speaker uses data to show how it’s best to show data. Meta!

The analysis shows that position and length are simpler to analyze than area and color hue. This has dramatic implications when creating visuals. At the end of his talk, the speaker mentions that his company Trifacta suggests what form a visual should take in order for the analyst to quickly understand the implications of the data.

One other note is that the speaker talks about his class that he teaches as he gives an example. A little internet sleuthing and knowing that UW doesn’t put lectures behind a log-in barrier led me to his course website. It has over 1000 slides of amazing data visualization content. The link to the course and its materials is also listed below.

Video: “The Future of Data Visualization” – Jeffrey Heer

Course: UW CSE 512 – Data Visualization

Lectures: CSE 512

(Also take a look at anything from Strata + Hadoop, the entire conference is on Youtube)

Press X to JSON

Source: Press X to Jason (Heavy Rain / Music Video) – YouTube

One happy memory from my freshman year is watching someone play “Heavy Rain” in my friend’s dorm room with 10 other people. It is a mystery game that has the lead character searching for his son. During an opening scene you notice your son is missing, and you call out to him “JASON!!!” The Press X to Jason meme was born. Watch here for more information.

This happy memory was triggered as I’ve gone further down the rabbit hole of data science. JSON, for the uninitiated, is one of the most common and friendly data formats, and a majority of API responses are sent back in it. APIs are how information is requested and distributed by major companies. My new goal is to learn how to press X to JSON.

A few weeks ago as I was watching videos on proper data visualization, I stumbled upon a recent high-profile data visualization. For the full visual, go here. In this visual, Mike Barry and Brian Card show patterns and interesting features of a data set they built from Boston’s Massachusetts Bay Transportation Authority (MBTA). I have been wondering how huge data sets are created by individuals and this answered my question. The two creators assembled this data by repeatedly requesting the information from the MBTA API. They stored this information in giant data frames. They knew how to press X to JSON.
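To make “pressing X to JSON” concrete, here is a minimal sketch of that request-and-store loop in Python. I don’t know how Mike and Brian actually wrote theirs, so treat everything here as an assumption: the function names are mine, the response layout (a “vehicles” list with id, route, lat, and lon) is made up for illustration and is not the real MBTA format, and the URL is left as a parameter you would supply.

```python
import json
import time
from urllib.request import urlopen


def fetch_json(url):
    """Request a URL and decode the JSON body into Python objects."""
    with urlopen(url) as response:
        return json.load(response)


def flatten_snapshot(payload, fetched_at):
    """Turn one response into flat rows, one per vehicle.

    Assumes a hypothetical payload shaped like {"vehicles": [{...}, ...]}.
    """
    return [
        {
            "fetched_at": fetched_at,
            "route": vehicle["route"],
            "vehicle": vehicle["id"],
            "lat": vehicle["lat"],
            "lon": vehicle["lon"],
        }
        for vehicle in payload["vehicles"]
    ]


def poll(url, times, pause_seconds):
    """Request the same endpoint repeatedly and accumulate every row."""
    rows = []
    for _ in range(times):
        rows.extend(flatten_snapshot(fetch_json(url), time.time()))
        time.sleep(pause_seconds)
    return rows


# Offline demonstration with a made-up response, so no network is needed.
sample_response = """
{"vehicles": [
    {"id": "1001", "route": "Red", "lat": 42.35, "lon": -71.06},
    {"id": "2002", "route": "Orange", "lat": 42.36, "lon": -71.05}
]}
"""
sample_rows = flatten_snapshot(json.loads(sample_response), fetched_at=0)
```

The flat list of dictionaries is the part that matters: each row can be appended to a file or loaded into a data frame, and a long enough run of poll() is how a small script grows into a large data set.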

One of the best programming communities I have found is the open source world. I can see how amazing projects are built. This allows me to learn so much faster than I could on my own. The MBTA VIZ project’s GitHub repository can be found here. The data, java, html, and css files are all there.

I’ve spent the last week learning how to showcase work on the internet by building websites. Now I want to learn to press X to JSON and build something. It may not be as good as what Mike and Brian created, but it’ll be mine.

First step: Find an API. The best place to find APIs is programmableweb.com.

Next: Analyze the data to find something interesting to talk about.

Finally: Publish using R Markdown for a technical audience.

Extra Effort: Publish something pretty and interactive similar to the MBTA visualization.

Dare to be Great

I want to be Elon Musk. I want to be Dick Costolo. Fortunately for me, the path to getting there is laid out in this post by Musk’s ex-wife. As someone who has always looked to others for role models, I have continuously looked at these people as my heroes. I aspire to be them every day. The end of Justine Musk’s answer hit me hard. She states “Don’t follow a pre-existing path, and don’t look to imitate your role models.” Damn…

I’m never one to stay down though. I’d like to offer a synthesis of Justine’s caricature of the tech greats with my world view. Here are the main points I’d like to keep from her Quora answer and how they mesh with what I believe:

The most important piece to keep from her answer is to “Be obsessed. Be obsessed. Be obsessed.” After going through both periods of intense focus and times of directionless meandering I have come to see that the truly inspirational do not follow a straight path. They merely go a hundred miles an hour in anything that marginally sparks their interests. An example: if you watch Dick Costolo in interviews, you realize just how smart he sounds off-the-cuff. These skills came from his time as an improv comedian. That’s right, the CEO of Twitter was a comedian (check here if you don’t believe me).

Another aspect of Justine’s answer is to handle high levels of stress and uncertainty. Anyone who is operating at the frontier must take risks and have ownership over their failures. Justine says “They will experience heroic, spectacular, humiliating, very public failure but find a way to reframe until it isn’t failure at all.” Spin is amazing. (For a tutorial on spin, watch Thank You for Smoking.)

I have always bought into the philosophy that a person is not able to take on the world alone. “They seek partnerships with people who excel in the areas where they have no talent whatsoever.” People, products, and companies are only able to be great when supported by the efforts of others. We praise the individual because we love a super hero, but it is only with the support of the crowd that the super hero is able to be elevated.

I do disagree with her on one point: if Elon Musk were starting out right now, he would constantly be surfing the net. Books continue to be a fantastic resource, but they are only one of many. Open courseware allows access to courses from the premier institutions. Coding can be learned online. Information becomes dynamic. The opportunity to learn is in the aggregation of all media sources. The important takeaway is to have an insatiable thirst for knowledge.

I still stand by my thoughts on ideation. Nothing is truly original. I like my role models. I like emulating the greats. Even if they are still alive. I want to be Elon Musk. Eventually…

Photo Source: tuftsrecycle

Intentional Data Analysis: “Know before you go”

Lean Analytics – Ken Norton (Google Ventures)

Watch this video. It is a fundamental staple from my self-guided education over the past year. After learning from Ken’s excellent presentation, I was able to think about how to properly design my data analysis so that every metric that I collected, analyzed, and visualized would have some actionable piece of information realized. The beautiful business term of INSIGHT!

A personal example: I made Excel macros at work, as a majority of my job involved formatting and thus was ripe for automation. I packaged these macros into an add-in that allowed the rest of my team to easily load the macros into their instance of Excel and click buttons. After watching “Lean Analytics”, I added a call that recorded the main metrics I was interested in: the computer name of whoever was using the add-in and which macro they were running. By collecting this information I was able to tell my boss that I was cumulatively saving about a third of a team member’s time. I can also see which macros have had the most effect. By recording this information I have been able to focus my attention on where I can have the most impact.
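The real version lived inside the Excel add-in, but the idea is small enough to sketch in Python. Everything below is illustrative: the function names, the computer and macro names, and especially the minutes saved per run are assumptions I made up for the example, not my actual numbers.

```python
from collections import Counter

# In the add-in, this would be a shared file or table rather than a list.
usage_log = []


def record_usage(computer_name, macro_name):
    """Store the two metrics of interest each time a macro runs."""
    usage_log.append({"computer": computer_name, "macro": macro_name})


def runs_per_macro(log):
    """Count how often each macro was run."""
    return Counter(event["macro"] for event in log)


def minutes_saved(log, minutes_per_run):
    """Estimate total time saved from an assumed saving per run."""
    return sum(minutes_per_run.get(event["macro"], 0) for event in log)


# Hypothetical events.
record_usage("PC-01", "FormatReport")
record_usage("PC-02", "FormatReport")
record_usage("PC-01", "CleanHeaders")

# Hypothetical estimate of manual minutes each macro replaces.
ASSUMED_MINUTES_PER_RUN = {"FormatReport": 10, "CleanHeaders": 3}

most_used = runs_per_macro(usage_log).most_common(1)
total_minutes = minutes_saved(usage_log, ASSUMED_MINUTES_PER_RUN)
```

Two fields per event are enough to answer both questions: who is using the add-in, and which macro deserves the next round of improvement.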

The ability to gain these insights is limited by creativity during both analysis AND planning. To make life easier, think about creating your databases, tables, and other data structures with the intent that you are going to query for specific, actionable information. The time you spend properly designing your data storage will pay off many times over during the rest of the business’s life. If you don’t know about data storage best practices, you should google them.
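As a small sketch of designing storage around the question, suppose the question is “how many times was each macro run each month?” The table and column names below are hypothetical, the rows are made-up sample data, and I am using Python’s built-in sqlite3 module purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The columns are chosen because the question needs them, nothing more.
conn.execute(
    """
    CREATE TABLE macro_runs (
        ran_on   TEXT NOT NULL,  -- ISO date, so months group and sort cleanly
        computer TEXT NOT NULL,
        macro    TEXT NOT NULL
    )
    """
)

# Made-up sample rows.
conn.executemany(
    "INSERT INTO macro_runs (ran_on, computer, macro) VALUES (?, ?, ?)",
    [
        ("2015-01-05", "PC-01", "FormatReport"),
        ("2015-01-06", "PC-02", "FormatReport"),
        ("2015-01-06", "PC-01", "CleanHeaders"),
        ("2015-02-02", "PC-02", "FormatReport"),
    ],
)

# Because the date is stored in a sortable form, the question is one query.
runs_by_month = conn.execute(
    """
    SELECT substr(ran_on, 1, 7) AS month, macro, COUNT(*) AS runs
    FROM macro_runs
    GROUP BY month, macro
    ORDER BY month, macro
    """
).fetchall()
```

Had the date been stored as free text like “Jan 5th”, the same question would need cleanup code before any counting could start.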

Photo Source: Simran Jindal

Data Dashboards, So Hot Right Now!

Today I saw a company that was so beautiful I almost cried. Domo has come out of the shadows to unveil a $2B business with a product suite both complete and comprehensive. Check out the videos, testimonials, and examples on their website (www.domo.com).

What I truly love about this company, as well as others in the data analytics industry, is that they are taking some of the most complex problems and boiling them down so that anyone can harness the power behind data. Taking Domo as an example, they are able to give insights in areas where people did not realize it was possible. For SABMiller they were able to aggregate all of the brands and show the C-suite executives exactly what the whole company was experiencing. The ease of setting up graphs and charts to show a company goes along with my belief in the power of infographics. If everyone could be given the power to create a story using data, the world would become a better place much faster.

Another great company that I was exposed to recently is Quid. This company aggregates, filters, and assembles giant webs showing all of the links between different publications. My friend who is a patent analyst uses it to see every single patent for a company and map out the exact positioning of the portfolio. Reading 7000 patents for a company would be impossible, but Quid allows you to do it almost instantaneously while producing a visualized model to easily understand all the information. The platform also works with news aggregation. It’s beautiful.

Looking at the way that each of these companies builds data products is inspirational. Taking their ideas and thinking about how to use them in analogous ways could lead to equally if not more extraordinary products. See my post about innovative thinking: ideation for how to use these companies as a valuable resource.

Some other great companies:

Palantir

Tableau

Photo Source: Domonation

Innovative Thinking: Ideation

Where Good Ideas Come From (TED Talk)

What is Original (TED Talk Radio Hour)

Embrace the Remix (TED talk)

“If I have seen a little further it is by standing on the shoulders of Giants” – Sir Isaac Newton

I believe the idea of being an inventor is inherently flawed. There are very few entirely “new” ideas. Humanity has spent millennia working to get to this point in time. If I believed that I have the capability to build something from scratch with no outside influences, I’d be a fool. We did not get here without building off of those that came before us. So when considering ideation, the best approach is to look at examples and build on top of or improve upon what has been proven great.

This may not be the most common belief in the tech community, where people frequently believe that services and products disrupt marketplaces by being original. By believing this, we blind ourselves to a potentially better process for ideation. Whenever I try to think of something creative, I strive to assemble some of the best ideas from across a variety of industries. In the above linked podcast, the speakers describe how the music and fashion industries combine great historical examples in new ways to create. The same process can be applied to tech ideation.

Let’s look at Uber as an example. It is arguably the biggest, most disruptive startup we have seen in the past 5 years. The world is applauding Uber for coming into a stale market with a brand new product. But Uber is merely a combination of the many pieces of software that support it. The company uses Google Maps to navigate and coordinate its drivers, the device’s GPS to locate both car and passenger, a payments processor, and many more components. Each of these was once a stand-alone product; now they are being used as building blocks.

Another method for ideation is to look at other industries and imagine some way to use the original idea in a new way. One of my favorite examples of this is in marketing. An example from the best product growth podcast ever (skip to 40:50): Jared Fliesler talks about the unique way that Intuit marketed Quickbooks. Give it a listen as I can’t do justice to the phrasing.

My final piece of advice on ideation is that creativity requires constraints. Imagine having someone tell you to be creative. No rules, no guidelines, they just say “be creative, show me something.” Now imagine that instead, they had handed you a pen and paper and said be creative. This completely changes the types of things that you may create. Expanding this argument to the tech space, look for ways to create constraints on projects or ideation times. These can be artificial such as narrowing the goal to a specific industry or outcome or they can be imposed upon you by system limitations or physical constraints. A great way to do this is to look at a building block, such as the Google Maps API, in order to build something. Who knows, maybe you’ll make the next Uber.

Photo Source: Wikipedia (Creative Commons)

Infographics are Cool

The long form website is in vogue. For those who don’t know, long form is a way to describe the infinitely scrolling (or seemingly infinite) pages that companies have begun to use. These are mostly ways to create a story behind their product and help the user understand what benefits they might get. An unexpected bonus to society is the new importance of infographics. These short, data-rich features showcase the benefits of the product in ways that text cannot explain.

I bought a book to help me understand how to design infographics and what goes into them. I found it in an art/architecture/design store and have combed through only a fraction of the pages. As I find beautiful examples, I’ll be sure to update this post and include an analysis on what makes them so effective. For now, my goal is to combine the skills I’m gaining in R with the story-telling prowess of an infographic. More to come.


Here’s an interesting blog that showcases infographics:

www.coolinfographics.com

How to Build a Website (infographic)

A First Foray at R

The Grammar of Data Science – Deep Ganguli

I was inspired by the above article to step up from the simple analytical tools of Excel and Access that are all too prevalent in the business world (thanks Microsoft…). After debating between learning a general-purpose coding language such as Python or Ruby, I decided that a language built specifically for math and data manipulation would be the best bet. Now I’m in love with R. I’ll update this post as I continue to find new resources for learning the language but for now, here is what I’m using:

Datacamp – this is a fantastic resource for learning R basics. If you know anything about data analytics and visualization, the modules go quickly and you get immediate feedback on whether you are picking up the coding language.

Photo Source: Rprogramming.net