With Mother’s Day quickly approaching, there’s no doubt that flowers and greeting cards will once again be in high demand. After all, recognizing the mothers of the world (over 85 million in the U.S. alone, according to the Census Bureau) is near and dear to all of our hearts. Mothers bring things like love, kindness, and compassion to our world. They provide a safe and tidy household in which kids can grow and flourish.
Believe it or not, the idea of a sound household applies to master data as well. Like kids, data must be groomed and shaped so that it performs well when called upon later for a specific purpose. Grouping related records that originate in different sources is quite a challenge, however easy it looks at first glance. Householding is a formidable task, and arguably the most difficult one in the ETL process.
So what is it exactly? Well, householding identifies data in one source that is in some way related to data in another source. The combined data is stored in a data warehouse where it can be viewed in its entirety, offering a more “complete” picture of a real-world entity. Members of the sales department can view marketing statistics, and vice versa.
What sort of scenario does householding address? A very common scenario involves a multi-generational family that lives in one dwelling. Perhaps the son and father are both customers of a particular bank. Should a relation not be found or recognized between them, both entities will likely receive a piece of direct mail, adding to the cost of the campaign. With today’s campaigns going beyond just separate members of the same family, householding may also address multiple classifications of the same individual. This person could very well be the head of a household, a single parent, retired, etc.
That’s simple enough, so why is it so difficult?
The first hurdle to get past involves the quality of data. We all know that data comes in all shapes, sizes, and ages. Going beyond the regular misspellings or data populating a field it shouldn’t, how current is the data? Updating things like address databases is not an overnight process, plus who knows if the individual who moved even notified the local post office.
How is data stored? This still pertains to data quality, but more on the side of master data consistency. Are the fields identical from one source to another? Are phone numbers or addresses all stored according to normal standardization (separated fields)? Take the first name Katherine, for example. A vast number of variations can sprout from this name, creating potential for the same entity to be represented in different ways:
Ok, the last one was a stretch, but you get the point: morphing names become a data quality villain. The cause could be as simple as the mood the real-life person was in the day they filled out a company’s questionnaire, and/or how the data was interpreted when it was entered into the system. The effect is clear, though. If the company’s data quality rules are not set up to handle this issue, grouping that data creates more questions and uncertainty when matching takes place.
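One common way to tame name variations is to map known variants to a single canonical form before any matching happens. The sketch below is only an illustration: the NICKNAMES table and the canonical_first_name function are hypothetical, and the variants listed are assumptions rather than a complete or authoritative list.

```python
# Hypothetical lookup table (an assumption for illustration);
# a real system would rely on a much larger, curated list.
NICKNAMES = {
    "kate": "katherine",
    "katie": "katherine",
    "kathy": "katherine",
    "catherine": "katherine",
}


def canonical_first_name(raw: str) -> str:
    """Lowercase the name, keep letters only, then map known variants to one form."""
    cleaned = "".join(ch for ch in raw.lower() if ch.isalpha())
    # Names missing from the table are returned in their cleaned form.
    return NICKNAMES.get(cleaned, cleaned)


if __name__ == "__main__":
    for name in ["Katherine", " Katie ", "Kate.", "John"]:
        print(repr(name), "->", canonical_first_name(name))
```

A table like this only covers the variants someone thought to list, so unexpected spellings will still slip through and need another rule or a manual review.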
That leads to matching rules and attributes. How do the rules remove duplicates and then group data between multiple systems? We saw that variety in names alone can cause headaches. Will Katherine Smith match up to Kate or Katie Smith? More fields are often involved to help confirm a match to a group, address fields being the common first choice. Addresses that aren’t current also offer resistance to accuracy, so business planners may choose to go further, such as including email addresses. Social security numbers would work great too, but do all source systems within an enterprise possess this field? Another consideration is the time required to process, because logic tells us that processing time increases with each additional field used for matching purposes.
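To make the idea of a multi-field matching rule concrete, here is a minimal sketch. It assumes a rule that is not taken from any particular product: two records are the same person when the normalized names agree and either the address or the email address agrees. The Record type, the field names, and the small nickname table are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    first_name: str
    last_name: str
    address: str
    email: str = ""  # not every source system has this field


# Hypothetical nickname table, kept tiny for the example.
NICKNAMES = {"kate": "katherine", "katie": "katherine"}


def norm(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def first_name_key(name: str) -> str:
    cleaned = norm(name)
    return NICKNAMES.get(cleaned, cleaned)


def is_same_person(a: Record, b: Record) -> bool:
    """Assumed rule: names must agree, plus the address or the email."""
    names_match = (
        first_name_key(a.first_name) == first_name_key(b.first_name)
        and norm(a.last_name) == norm(b.last_name)
    )
    if not names_match:
        return False
    address_match = norm(a.address) == norm(b.address)
    # An empty email never counts as a match.
    email_match = bool(a.email) and norm(a.email) == norm(b.email)
    return address_match or email_match
```

Here the email check acts as a fallback for an out-of-date address. Each extra field adds another comparison per pair of records, which is where the growth in processing time comes from.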
Another point to consider is to what end you wish to group records, which is fueled by the desired sales or marketing effect of the campaign. Should the matching be based on physical residences (dwellings), groups/clubs to which the entities belong, or simply other family members? How do you want to handle family members at that location with different last names?
Marketing demographics will likely affect what sort of matching rules are established. If a majority of the neighborhood’s inhabitants are single, matching will probably involve name and address, but for areas where families reside in one dwelling, the name field might be omitted. An additional factor is the underlying reason for the campaign. Do you merely need to distribute a simple message, or send a request for response from each individual, regardless of their relation to others within the household?
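The choice between matching on name plus address and matching on address alone can be sketched as a switch on the grouping key. This is a simplified illustration under stated assumptions: records are plain dictionaries with hypothetical last_name and address fields, and the household_key and group_households functions are invented for the example.

```python
from collections import defaultdict


def household_key(record: dict, use_last_name: bool) -> tuple:
    """Build the grouping key: address only, or last name plus address."""
    address = " ".join(record["address"].lower().split())
    if use_last_name:
        return (record["last_name"].lower().strip(), address)
    return (address,)


def group_households(records: list, use_last_name: bool) -> list:
    """Group records that share the same key; each group is one household."""
    groups = defaultdict(list)
    for record in records:
        groups[household_key(record, use_last_name)].append(record)
    return list(groups.values())


if __name__ == "__main__":
    customers = [
        {"first_name": "John", "last_name": "Smith", "address": "12 Oak St"},
        {"first_name": "Mike", "last_name": "Smith", "address": "12 Oak St"},
        {"first_name": "Ann", "last_name": "Jones", "address": "12 Oak St"},
        {"first_name": "Bob", "last_name": "Lee", "address": "5 Pine Rd"},
    ]
    # Address only: everyone at 12 Oak St lands in one household.
    print(len(group_households(customers, use_last_name=False)))
    # Last name plus address: Ann Jones becomes her own household.
    print(len(group_households(customers, use_last_name=True)))
```

Sending one mail piece per group rather than one per record is what saves on campaign cost. A campaign that needs a response from each individual would skip the grouping and mail every record.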
Follow the example of a good household
The impact of a proper householding process is directly felt in every sales or marketing campaign a company launches. And the effect doesn’t stop there. Remember, householding groups data from all source systems to provide a more complete picture. An entire enterprise can benefit from this, especially employees involved with CRM. Campaign costs go down and efficiency goes up, all thanks to good householding processes.