Names are important. We use them every day, whether referring to ourselves or everyday objects. The more descriptive and self-describing a name is, the more it intrinsically conveys meaning and eliminates ambiguity and any unknowns.
But what do I mean by that, and why does it matter?
Consider an ordinary object where the definition was universally accepted: the ‘teeth’ on a sawblade. A different word could have been used, though it would certainly have been more difficult to inform someone about what made a sawblade a sawblade. Defining the word would be required and it’s possible that the explanation would eventually resort to the ‘tooth’ analogy.
This same idea applies to naming and organizing files and data at the onset when building a software solution for an organization. Once you create, gather or start manipulating datasets, they can quickly become disorganized. You can save time and prevent errors by agreeing on a specific naming convention. This will create consistency and eliminate ambiguity, the enemy of efficiency.
Given that most solutions are built in a collaborative team environment, let’s consider a data example to bring this concept to life.
Suppose there is a dataset with point of sale transactions for a gift shop on a large cruise line operator, that sells anything from mugs, to chocolates, to merchandise. A subset of the attributes might include: transaction_id, product_id, amount. However, the attribute ‘amount’ is vague. Even within the cruise line operator, it isn’t obvious what should be represented. Is it the ‘amount’ of items sold or the total dollar ‘amount’ for all items on this transaction or something else entirely?
One way to clear up this confusion is with a comprehensive data dictionary, where each entity that is sold in the gift shop is listed, defined and placed in context. In addition, each attribute of the entity is given the same treatment.
However, if we rename the attribute amount to ‘unit_price_in_cents,’ we have clearly communicated what is being represented and provided a domain for the values. There is an expectation that a ‘units_sold’ attribute should exist in the entity and if not it immediately raises questions. Those intuitions happened because the attributes were named accurately and with precision. The values for unit_price_in_cents can be validated based on the types of products that are sold. If there is a non-integral value and the cruise line operator isn’t selling flowers aboard the ship, then they will need to take a second look for data issues.
Unit price was used in this example, but other concepts such as weight, size and velocity can also benefit from the same treatment. Having the right naming convention can be particularly valuable when the teams involved are global. An attribute named ‘unit_weight_lbs’ is more complete than ‘weight’ when the receiver of the data lives in a country that uses the metric system.
While naming seems like it should be a simple and quick task, it can be easily rushed and overlooked. This can lead to disorganization, confusion and inefficiency when it comes to working with technologies. However, putting intention into each name while considering the audience will provide long-term benefits.
Communication is essential, and establishing an intentional naming practice is a simple way to reduce miscommunication and improve efficiency, thus laying the foundation for a successful software solution.