Aron Roberts
It's late in 1999. You're responsible for Y2K fixes on some of your department's PCs and Macintoshes, and you think you've checked everything. You've found and fixed the hardware problem on some of your PCs which could cause their system dates to revert to 1980 when January 1, 2000 arrives. You've applied Microsoft's Y2K fixes to the Windows 95, 98, NT 3.5.1, and NT 4.0 operating systems. You've identified the critical application programs on these computers, checked their vendors' websites for Y2K statements about these programs, and installed vendor-recommended updates or upgrades that resolve their Y2K problems. You've also found and fixed Y2K problems in the source code of any critical, customized programs or scripts that may be running on your department's servers or workstations.
Is there anything else you need to check? Yes. In addition to hardware, operating systems, and application programs, there's a fourth area of Y2K concern: data. The following list highlights some key examples of data-related Y2K problems you will need to check:
Y2K issues during data entry and interchange. To check for these problems, identify the ways that you and others in your department enter dates into documents and exchange dates between programs. Key problems to watch for include:
These types of problems can be addressed by educating users and identifying situations in which dates are exchanged using century-ambiguous two-digit years.
Y2K problems within data files. To check for these problems, scan and analyze critical databases, spreadsheets, and other data files. Key problems to watch for include:
How common are Y2K problems within data files? The only "real world" number we've seen to date is from Allen B. Falcon, president of the Boston-based consulting firm Horizon Information Group, Inc., who mentioned in a posting to a Year 2000-related mailing list that several of his client organizations who endeavored to scan and analyze all of their data files found that less than "2% of all files required remediation."
Many of us are in the habit of entering dates in "shorthand", with only two-digit years. For instance, we're used to entering a date like "11/20/98" to represent November 11, 1998. In some instances, however, this practice can lead to your programs assigning such dates to an incorrect century. Users of spreadsheet, database, and statistical programs, in particular, should be made aware of this risk and given information about how to minimize it.
This problem occurs because, when you enter a date with only a century-ambiguous two-digit year, many application programs will essentially take an educated guess at what century -- the 20th or 21st -- you "probably meant". Typically, this is done using "windowing" rules that merely assign dates to one century or another based on whether their two-digit years fall before or after an arbitrary "pivot year". Unfortunately, the rules for doing so often vary between programs:
As a result of the variability of these rules, the potential exists for some dates with two-digit years to be assigned by your programs to a different century than you intended. If you don't notice that this has occurred, incorrect dates could be stored in your data files.
As we move into a period in which any date entered into a data file might reasonably fall either within the last century or the new century, we encourage you to promote the practice within your department of always entering dates with full, four-digit years. A document on the UC Berkeley Year 2000 Information website, Why You Should Enter Dates With Four-Digit Years Whenever Possible (http://y2k.berkeley.edu/computers/fixpcs/issues/enter-4digit-years.html), discusses this topic in more detail. As a less satisfactory alternative, users who enter two-digit-year dates should become keenly familiar with the rules that their critical programs use when assigning such dates to a particular century, and should visually verify that such assignments have been made correctly.
Finally, for cases in which your programs offer such features, setting up input masks or validation filters to require that critical dates be entered with four-digit years might be warranted. In addition, you might combine this with other traditional methods of data validation, such as checking to ensure that dates are entered within valid ranges.
Dates with two-digit years are at risk of switching centuries when you exchange data containing such dates between two different programs (or even, in some cases, between two different versions of the same program). This could occur in any situation in which you use programmatic data exchanges, text files, other "intermediate" file formats, or even the clipboard to export data from one program and import it into another program.
This problem occurs because two-digit-year dates are century-ambiguous: "05/13/29" might reasonably represent either May 13, 1929 or May 13, 2029. The two programs between which data is being exchanged might, in some situations, use different methods for assigning such dates to the 20th or 21st centuries, which could cause some dates to switch centuries as a result of being exchanged.
Here's a possible scenario of how this could occur:
Thus, because a two-digit-year date was exchanged between two programs, a date that was originally stored as May 13, 2029 in an Excel spreadsheet was changed to May 13, 1929 when it was imported into a FileMaker Pro database.
To find such problems, you'll need to identify any circumstances under which you -- or the departmental computer users with whom you are working -- exchange critical date-containing data between programs. In some cases, this could occur on a regular basis, while other data exchanges might occur periodically or even just occasionally. From there, you'll need to identify whether two-digit-year dates are exchanged as part of this data. A document on the UC Berkeley Year 2000 Information website, Data Sharing Methods: Finding & Resolving Y2K Problems (http://y2k.berkeley.edu/computers/fixpcs/issues/data-sharing-methods.html) provides further, general discussion of this issue.
Many spreadsheet and database programs offer built-in functions for handling dates. Such functions, including Microsoft Excel's DATE(), Microsoft Access's DateSerial(), and Lotus 1-2-3's @YEAR(), may not always return the results you're expecting when used with dates in the year 2000 and beyond. These built-in date functions can be especially tricky when used with century-ambiguous two-digit-year dates such as "00" or "01". If these date functions are being used in formulas, macros, or scripts without a complete understanding of how they handle such dates, incorrect data could be introduced into your spreadsheets or database files.
Two examples:
DATE() function, the formula =DATE(05,13,00) will be interpreted as May 13, 1900, rather than May 13, 2000, as some Excel users might be expecting. This function interprets all two-digit-year dates as falling within the 1900s.
To further complicate matters, with Excel for the Macintosh, whose worksheets by default use a date system that begins in 1904, the formula =DATE(05,13,00), and more generally, the DATE() function, when used with any two-digit years in the range from "00" through "03", will usually generate the error value #NUM!.
@YEAR() function in at least some versions of Lotus 1-2-3, when the function's sole argument is a reference to a cell containing a date in the year 2000, it will return the three-digit value 100, not the four-digit year 2000 that some 1-2-3 users might be expecting.
To identify these problems, visit your vendors' websites and read the Y2K compliance statements for your critical applications carefully. Most vendors will detail any Y2K issues with their programs' built-in date-related functions. Your department's spreadsheet and database users will need to be aware of these issues when creating new files or modifying existing files containing such date functions.
In addition, you can use a variety of software tools to help you identify potentially problematic uses of date functions in your department's critical spreadsheets and databases. Some of these tools are listed on the UC Berkeley Year 2000 Information website, in Software Tools for Finding & Resolving Y2K Problems (http://y2k.berkeley.edu/computers/fixpcs/tools/). Another document on this website, Data Files: Finding & Resolving Y2K Problems (http://y2k.berkeley.edu/computers/fixpcs/issues/data-files.html) presents a general approach to resolving Y2K problems in data files.
Another category of potential Y2K problems in data files involves custom programming. Many database and spreadsheet programs, in particular, offer a variety of ways to write custom programs which are either stored directly within, or can act upon, data files created by these programs. Such custom programming might appear in a variety of formats, including formulas, functions, macros, and scripts.
If such custom programming handles dates in any way, it is at risk for harboring Y2K-related or other date-related problems. This could occur, for example, if it uses built-in date functions of the database or spreadsheet program. As noted above, these functions can sometimes return unexpected results when used with 21st century dates. In addition, custom programming code might incorrectly handle dates in other ways, such as in operations to compare dates or to calculate the interval between dates.
The Customized Applications portion (http://y2k.berkeley.edu/cgi-bin/frame.cgi?/customizedapps/index.html) of the UC Berkeley Year 2000 Information website provides extensive information about finding and resolving problems in custom-written or highly customized applications, including applications created using some popular database software for personal computers. As an alternative to manually scrutinizing source code, some software tools can help you identify potential problems with custom programming in your critical spreadsheets and databases. A number of these tools are listed in Software Tools for Finding & Resolving Y2K Problems (http://y2k.berkeley.edu/computers/fixpcs/tools/).
A third key area of concern is "dates" which are actually stored in text or numeric formats within your data files. For instance, some files could contain "dates" with century-ambiguous two-digit years, such as "05/13/29", stored within a database's text or numeric fields or a spreadsheet's text-formatted cells.
Y2K problems can potentially arise when such dates stored as text or numbers are sorted, compared, or otherwise processed by programs, scripts, or macros. In addition, there are known Y2K issues with some built-in date functions that convert text to dates, such as Microsoft Excel's DATEVALUE() and FileMaker Pro's
TextToDate().
In some cases, you may be able to detect certain of these problems via manual scanning. For example, you may be able to manually identify fields in database files in which dates have been stored as text or numbers simply by examining their data structures and field names or by examining sample records. Some software tools can also assist you in identifying places in which "dates" have been stored in your files as text or numbers, as well as related problems with built-in date functions and custom-written scripts or macros.
[ Next Article | Contents | Search BC&C | BC&C Main Menu | IST | UC Berkeley ]
Berkeley Computing & Communications,
Volume 9, Number 5 (November-December 1999)
Copyright 1999, The Regents of the University of California