IRM Guideline 9, Version 1
DOCUMENT OVERVIEW
Purpose for This Document
The purposes of this document are to: 1) inform managers and decision-makers of the need for a good data naming program, 2) explain the business benefits of a good data naming program, 3) show how data naming fits within a larger data administration function, and 4) identify action steps that enable agencies to begin realizing the benefits of a data naming program.
Using This Document
This document is organized into four sections:
Audiences for This Document
The intended audiences for this document are executives, managers, decision-makers and data administrators with responsibility for an organization's data resources. Data base administrators, warehouse or repository implementers, systems and business analysts and those preparing vendor contract performance requirements may find this document useful for background or general information about data naming. Technical audiences should also refer to the companion document: "Data Administration: A Data Naming Practitioner's Guide".
Potential audiences are identified in the following chart. The chart identifies why each audience should be concerned with data naming, actions to take, how to proceed, and which sections of this document to read.
POTENTIAL AUDIENCES & SUGGESTED ACTIONS
| Who? | Why? | What? (Actions) | How? (To Proceed) | What To Read? |
|---|---|---|---|---|
| Executives; managers; decision-makers (including CIOs) | Data naming is key to maximizing the value of data resources (and resource investment), sharing data with others, meeting customer data needs, and realizing other business benefits. | | | Strongly suggested: / For the technically inclined or interested: |
| Data Administrators | DA is responsible for the integrity of the organization's data resources. | Read this document if general background, or information about DA or data naming, is needed. | | Strongly suggested: |
| Data Base Administrators; Systems Analysts | Need to understand the role of DA and its perspective on data naming; how to work with DA. | Read this document if general background, or information about DA or data naming, is needed. | | Optional (but recommended): |
| Business Analysts | Need to understand how DA facilitates data sharing and data access; how to work with DA. | Read this document if general background, or information about DA or data naming, is needed. | | Optional (but recommended): |
| Creators of vendor performance contracts for data-related services | Need to understand the impact of DA, and data naming standards, on future vendor performance contracts. | Read this document if general background, or information about DA or data naming, is needed in order to prepare contracts or contract performance clauses. | | Optional (but recommended): |
Why Should Data Naming Be Important to an Agency?
'Data names' are unique identifiers of data that provide links to information about specific data, or to the actual data itself. A data name identifies data in the same way a person's name identifies a person. However, there is an important difference between data names and people names: there are no naming standards for people to prevent two people from ending up with the same name. Data names, when developed using data naming standards, are unique and accurate identifiers that prevent duplication within a particular environment.
Properly created data names help manage data resources by ensuring integrity (without duplication), providing clarity of meaning, and making data accessible to those who need it through precise identification of the required data. Data naming standards are typically developed and administered by a data administration function within an organization. Data administration is part of the data perspective within an organization's IRM (Information Resource Management) program. Appendix I, pages 1-3, shows how IRM and data administration fit together and contribute to the quality of information resources.
Data naming is not a new activity: state agencies already establish data names on a regular basis as part of developing information (data) resources. However, a formal data administration program that includes a data naming strategy and standards is important for consistently achieving the desired business benefits.
An example of the use and value of good data names is the Yellow Pages of the phone book. Unique data names have been developed for use in the Yellow Pages. The "data naming strategy" within the Yellow Pages ensures integrity without duplication by maintaining a category, 'Automobiles', without also maintaining the redundant category, 'Cars'. Those who try accessing information about 'Cars' find a cross-reference to 'Automobiles', instead of a duplicate listing under 'Cars'. Without naming standards, it is unlikely the Yellow Pages could have avoided having some entries under 'Automobiles' and others under 'Cars'.
However, the Yellow Pages example oversimplifies the data naming problems faced by most large organizations today. A more appropriate analogy might be an effort to standardize all Yellow Pages across phone companies. Since no naming standards existed across phone companies in the past, it is likely there are multiple naming standards in place. A feasible approach to consolidation might be to provide cross-referencing between phone books based on a newly developed set of standard names. The new names could then be cross-referenced to any number of phone books, while providing a point of continuity between the books. As long as one started a search from the new standard names, any existing book could be accessed, thus leading to data sharing among phone companies.
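The cross-referencing approach can be pictured as a lookup table keyed by the new standard names. The sketch below is only an illustration and is not part of this guideline: the book identifiers (`book_a`, `book_b`) and the category names are hypothetical.

```python
from typing import Optional

# Hypothetical cross-reference: each standard name maps to the local
# category name used in each existing directory ("phone book").
CROSS_REFERENCE = {
    "Automobiles": {"book_a": "Automobiles", "book_b": "Cars"},
    "Physicians": {"book_a": "Doctors", "book_b": "Physicians & Surgeons"},
}


def local_name(standard_name: str, book: str) -> str:
    """Return the name a given book uses for a standard name."""
    return CROSS_REFERENCE[standard_name][book]


def standard_name_for(local: str, book: str) -> Optional[str]:
    """Reverse lookup: find the standard name behind a book's local name."""
    for standard, by_book in CROSS_REFERENCE.items():
        if by_book.get(book) == local:
            return standard
    return None


print(local_name("Automobiles", "book_b"))
print(standard_name_for("Doctors", "book_a"))
```

Starting every search from the standard name means a new book can be added by extending the table, without renaming anything in the existing books.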
Within most organizations today, varied computer systems, data base managers and programming languages have been responsible for a proliferation of data naming standards. In effect, organizations today are faced with the equivalent of multiple Yellow Pages to consolidate, both within their organizations and externally when they share data with other organizations.
A good data naming strategy with proper discipline and management can help with data consolidation by providing a common point of continuity. Good data names also help reduce data costs (especially those associated with data redundancy) and improve the quality of data and other information resources. Data is a valuable asset that needs to be managed and protected like any other valuable resource. The value of data assets can be maximized by keeping data management costs to a minimum while maintaining - and improving - data quality.
Business Benefits of Having a Good Data Naming Strategy
The following list includes some of the business benefits that can be realized through the use of a good data naming program:
DATA NAMING RULES
In order to share data between organizations, or between computer systems within an organization, data must be uniquely and accurately identified. Accurate identification ensures that data can be defined in one place, and then shared with, or transmitted to, another place without losing its meaning or clarity. Data meaning and clarity are enforced through data names that have consistent formats and content. Standardized data names are developed based on two types of naming rules: format rules and content rules.
Format rules identify the parts of a data name and how the parts are put together in sequence to form a complete name. Content rules define what each part of the name may (or may not) contain and which abbreviations are permitted. Rules for both are within the scope of this document.
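As a rough sketch of how the two rule types differ, the check below assumes a hypothetical format rule (one or more descriptive words followed by a class word in the last position) and a hypothetical content rule (a short list of approved class words and a list of disallowed ambiguous words). None of these specific rules or word lists come from this guideline.

```python
from typing import List

APPROVED_CLASS_WORDS = {"Date", "Name", "Code", "Amount"}  # assumed list
AMBIGUOUS_WORDS = {"Info", "Misc", "Data"}  # assumed list


def check_name(name: str) -> List[str]:
    """Return a list of rule violations for a data name (empty if none)."""
    problems = []
    words = name.split()
    # Format rule (assumed): at least one descriptive word, then a class word.
    if len(words) < 2:
        problems.append("format: expected descriptive word(s) plus a class word")
    elif words[-1] not in APPROVED_CLASS_WORDS:
        problems.append(f"format: last word '{words[-1]}' is not a class word")
    # Content rule (assumed): no ambiguous words anywhere in the name.
    for word in words:
        if word in AMBIGUOUS_WORDS:
            problems.append(f"content: '{word}' is ambiguous")
    return problems


print(check_name("Birth Date"))
print(check_name("Date of Birth"))
print(check_name("Customer Info Code"))
```

Keeping the two checks separate mirrors the distinction in the text: a name can be well formed and still use a word that the content rules do not permit.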
Data Naming Format Rules
Consistent data naming formats ensure data names are always constructed the same way, regardless of who constructed the name, or where it was constructed. Computer systems can be programmed to recognize parts of names that are consistently formatted. Business users or citizens can also access the appropriate parts of data names when searching for key words.
For example, searching for dates is possible when the word 'Date' in a data name is always located in the same position within the name. 'Date of Birth' and 'Birth Date' do not follow a consistent format, so a computer would have to search for the word 'date' anywhere in the name in order to locate both. Unfortunately, this search might also find "dated" and "dateline", depending on how the search instruction was defined. Without consistent formats, those accessing data need to be more sophisticated searchers and, in some cases, even need computer expertise to formulate search commands.
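The search problem described above can be illustrated with a short sketch. The sample names are hypothetical, and the convention that 'Date' always appears as the last word is an assumption made only for this illustration.

```python
NAMES = [
    "Birth Date",
    "Date of Birth",
    "Dated Material Flag",
    "Dateline Text",
    "Hire Date",
]


def substring_search(names, term):
    """Naive search: matches the term anywhere, including inside other words."""
    return [n for n in names if term.lower() in n.lower()]


def positional_search(names, class_word):
    """Format-aware search: matches only when the last word is the class word."""
    return [n for n in names if n.split()[-1].lower() == class_word.lower()]


print(substring_search(NAMES, "date"))   # all five names match
print(positional_search(NAMES, "date"))  # ['Birth Date', 'Hire Date']
```

The positional search skips 'Date of Birth' because that name breaks the assumed format. Under an enforced format rule, such a name would not exist, and the simpler search would be complete.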
Data Naming Content Rules
Consistent content within data names ensures the words used in the names mean the same thing regardless of who constructed the name, or where it was constructed. Consistent content also means the words used in names are as clear as possible. Words used in data names that have ambiguous meanings tend to prevent accurate data identification or comparison with other data.
For example, both 'Birth Date' and 'Birthday' would be allowed without content rules. A computer search for the word 'date' would find only the first name, so duplication could exist without being discovered. This example also shows a second content problem caused by ambiguous meanings: 'Birthday' might mean only month and day, while 'Birth Date' more clearly includes the year of birth (standardized dates include month, day and year). The example also shows a violation of format rules: the part of the name that contains date information is not in a consistent position, so a computer could not match words based on their position.
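One way to picture how a content rule exposes hidden duplicates is to map non-standard words to their standard equivalents before comparing names. The replacement table below is hypothetical. Treating 'Birthday' as equivalent to 'Birth Date' is itself an assumption that an organization would need to confirm, given the ambiguity noted above.

```python
from collections import defaultdict

# Hypothetical content rule: non-standard word -> standard replacement.
STANDARD_WORDS = {"Birthday": "Birth Date", "DOB": "Birth Date"}


def standardize(name: str) -> str:
    """Rewrite a name using only standard words."""
    return " ".join(STANDARD_WORDS.get(word, word) for word in name.split())


def find_duplicates(names):
    """Group names that become identical once standardized."""
    groups = defaultdict(list)
    for name in names:
        groups[standardize(name)].append(name)
    return {std: originals for std, originals in groups.items() if len(originals) > 1}


print(find_duplicates(["Employee Birth Date", "Employee Birthday", "Employee Hire Date"]))
```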
DATA NAMING: STRATEGIES AND MANAGEMENT
Understanding 'Data Administration'
Data naming is typically a function within data administration (DA). Data administration has only recently evolved into a unique discipline. Functions now associated with DA were originally part of other disciplines, primarily data base administration (DBA). Data administration differs from data base administration in a couple of ways. First, data administration is oriented around an organization's assets instead of the data detail focus of data base administration. Second, data administration has a broader IRM focus, and usually reports to the organization's CIO (Chief Information Officer). Data base administration usually reports within the IS (or IT) development organizations.
As DA evolved, different views of its purpose and scope have surfaced. Some views are functional, while others are organizational, thus making it difficult to compare them. Organizational views tend to be arbitrary, since several can work successfully. Organizational views also tend to be less generic than functional views. For example, in some (organizational) views, data administration directs data base administration, while in other (functional) views the functions overlap. Functional views leave the organizational structure and specific implementation of the functions up to each organization.
This document views the disciplines functionally (as overlapping), rather than organizationally. Some data management functions that typically fall within data administration's area of influence include:
A model for data administration functions is shown in Appendix I, pg. 3. Data naming activities within data administration are shown in Appendix II, pg. 1. The portions of data naming within the scope of this document are shown in Appendix II, pg. 2. Detailed technical aspects of data naming, covered in the companion document, "Data Administration: A Data Naming Practitioner's Guide", are shown in Appendix II, pg. 3.
When Are Data Naming Standards Important?
Data naming standards provide consistency and continuity to data names, whether the names appear on data models or in data bases. Because data names uniquely identify data, naming standards promote a level of data integrity that is important for any data management environment. Data naming standards should generally be treated as one of the "best practices" for data management. However, data naming standards are critical to success in certain areas, and these areas should be among the first priorities for implementation.
Priorities for developing and implementing data naming standards should focus on:
What Is Management's Role In Establishing a Data Naming Program?
To ensure an effective data naming program is in place, agency management should:
Prerequisites for a Data Naming Program
Information technologies and systems in Minnesota government also fall under the jurisdiction of legislation that requires:
To achieve legislative mandates, statewide standards and guidelines:
Policies for data naming
Rules for data name content
- Enterprise data
- Object / data model data
Rules for data name format
- Enterprise data
- Object / data model data
Rules for alternate names
- Enterprise data
Methods for adding to, or modifying, data rules
APPENDICES:
Appendix I:
"Framework for Conducting Business Within an IRM Environment"
"Where Does Data Administration Fit into the Organization?"
"What is Data Administration?"
"Data Naming Activities Within Data Administration"
"Scope of 'Data Administration: A Data Naming Primer'"
Appendix II:
"Scope of 'Data Administration: A Data Naming Primer'"
© Copyright 2013 MN.IT Services - State of Minnesota