In SAS (Statistical Analysis System), numeric data is one of the two primary types of data (the other being character data), and it represents numbers that can be used for various calculations and statistical analyses. Numeric data can be in the form of integers (whole numbers), decimal numbers, or floating-point numbers (which may include very large or very small values). SAS uses numeric data to store, process, and analyze quantitative data in a dataset, such as test scores, sales figures, dates, and other continuous variables.
Key Characteristics of SAS Numeric Data Format
- Storage of Numeric Data:
- In SAS, numeric values are stored in a binary floating-point format. By default, each numeric variable occupies 8 bytes (64 bits) of storage. This format allows SAS to handle a wide range of values, from very large numbers to very small ones, including negative numbers and zero.
- Default Format for Numeric Data:
- By default, SAS stores numeric data without any specific formatting applied. However, when you display or print numeric data, SAS applies a format that dictates how the numbers will be represented in the output. If no format has been assigned to a variable, SAS uses BEST12. for display.
- SAS Numeric Format and Representation:
- SAS numeric data can be represented in various formats, such as standard formats, scientific notation, or date/time formats (for date-related numeric data).
- The numeric value itself is always stored as a floating-point number (with both the integer and fractional parts represented), but it can be displayed in different ways based on the specified numeric format.
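The separation between the stored value and its displayed form can be sketched with a short DATA step. This is only an illustration: the dataset name work.fmt_demo and the variable names are made up for the example, and the PUT function is used here simply to capture the formatted text of one stored value.

```sas
/* Illustrative sketch: one stored number, three display forms.      */
/* work.fmt_demo and all variable names are hypothetical.            */
data work.fmt_demo;
  x = 1234.5;                      /* stored once, as floating point */
  as_best  = put(x, best12.);      /* default-style display          */
  as_comma = put(x, comma10.2);    /* thousands separator, 2 decimals */
  as_sci   = put(x, e10.);         /* scientific notation            */
  put x= as_best= as_comma= as_sci=;   /* write all four to the log  */
run;
```

The value of x is the same in every case; only the text produced by each format differs.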
SAS Numeric Formats
When working with numeric data in SAS, formats are used to determine how the values will appear in output. These formats define how many digits or decimal places should be shown, how the numbers should be rounded, and whether scientific notation should be used.
Some of the most common SAS numeric formats include:
- BEST. Format:
- The BEST. format is one of the most commonly used numeric formats. It automatically chooses the notation that shows the most significant digits within the available width, depending on the value's magnitude and precision.
- BEST12. and BEST15. limit the output to 12 or 15 characters, respectively. A value that does not fit in that width in ordinary decimal notation is written in scientific notation instead.
Example:
data example; num = 1234567.89; format num BEST12.; run;
Output:
1234567.89
- COMMA Format:
- The COMMA. format is used to display numbers with commas as thousands separators. This is commonly used for financial and large numbers where readability is important. The number after the period sets the decimal places; without it (COMMA12.), the value is rounded to a whole number.
Example:
data example; num = 1234567.89; format num COMMA12.2; run;
Output:
1,234,567.89
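- DOLLAR Format:
- The DOLLAR. format is used to display currency values with a dollar sign ($) and commas for thousands. The width must leave room for the dollar sign, the commas, and the decimal point, so this value needs a width of 13 with 2 decimal places.
Example:
data example; num = 1234567.89; format num DOLLAR13.2; run;
Output:
$1,234,567.89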
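- PICTURE Format:
- SAS has no built-in format named PICTURE. Instead, the PICTURE statement in PROC FORMAT lets you define your own format from a custom pattern, controlling how many decimal places to show, whether to include leading zeros, and more. In the example, twodec is an arbitrary name chosen for the user-defined format, and the ROUND option makes the format round rather than truncate.
Example:
proc format; picture twodec (round) low-high = '0000009.99'; run;
data example; num = 1234567.89; format num twodec.; run;
Output:
1234567.89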
- Scientific Notation Format (E Format):
- For very large or very small numbers, SAS can represent the number in scientific notation. The Ew. format is used for this purpose. It reserves the first column of the field for a minus sign, so E12. leaves 11 characters for the number itself. (SAS also has a Dw.p format, but it aligns decimal points across values of similar magnitude rather than writing scientific notation.)
Example:
data example; num = 0.000000123; format num E12.; run;
Output:
1.23000E-07
SAS Numeric Data Types and Precision
- Storage Type: SAS stores numeric data as floating-point numbers. Each value is held internally as a binary significand and exponent, much like scientific notation, even though it may appear as an integer or a decimal in the output.
- Precision: SAS uses an 8-byte storage format for numeric values. This allows it to represent numbers with a precision of up to 15 significant digits. However, numeric values in SAS can lose precision due to the limits of floating-point representation: many decimal fractions (such as 0.1) have no exact binary form, and integers with more than about 15 digits cannot all be stored exactly.
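The precision limits described above can be observed directly. The following is a minimal sketch, assuming an IEEE 754 host such as Windows or Linux (mainframe hosts use a different floating-point layout and may behave differently); the dataset name work.precision_demo and the variable names are hypothetical.

```sas
/* Illustrative sketch; the comments assume an IEEE 754 host.        */
data work.precision_demo;
  total = 0.1 + 0.2;

  /* Direct comparison: expected to be 0 (false), because 0.1 and    */
  /* 0.2 have no exact binary representation.                        */
  exact_equal = (total = 0.3);

  /* Rounding before comparing is a common workaround; expected 1.   */
  rounded_equal = (round(total, 0.01) = 0.3);

  /* CONSTANT('EXACTINT') returns the largest integer up to which    */
  /* every integer is stored exactly in 8 bytes on this host.        */
  big       = constant('exactint');
  big_plus  = big + 1;
  collapsed = (big_plus = big);    /* expected 1: the +1 is lost     */

  put total= best32. exact_equal= rounded_equal= big= comma27. collapsed=;
run;
```

When comparing computed decimal values, rounding both sides to the precision that matters for the analysis is usually safer than testing for exact equality.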
Date and Time in SAS (Numeric Data)
In SAS, date and time values are also represented numerically:
- Date: SAS represents dates as the number of days since January 1, 1960. For example, the SAS value 22646 represents the date January 1, 2022.
- Time: SAS represents times as the number of seconds since midnight of the current day.
- Datetime: A datetime is stored as the number of seconds since midnight on January 1, 1960, so it carries both the date and the time of day in a single number.
Examples:
data example;
date_value = '01JAN2022'd;
time_value = '12:34:56't;
datetime_value = '01JAN2022:12:34:56'dt;
format date_value date9. time_value time8. datetime_value datetime20.;
run;
Output:
01JAN2022 12:34:56 01JAN2022:12:34:56
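Because dates, times, and datetimes are ordinary numbers, they can be used in arithmetic and taken apart or combined with functions. The sketch below illustrates this under the same literals as the example above; the dataset name work.date_demo and the variable names are hypothetical.

```sas
/* Illustrative sketch: date/time values are plain numbers.          */
data work.date_demo;
  d  = '01JAN2022'd;               /* days since 01JAN1960           */
  t  = '12:34:56't;                /* seconds since midnight         */
  dt = '01JAN2022:12:34:56'dt;     /* seconds since 01JAN1960 00:00  */

  next_week    = d + 7;            /* date arithmetic is addition    */
  date_from_dt = datepart(dt);     /* date portion of a datetime     */
  time_from_dt = timepart(dt);     /* time portion of a datetime     */
  rebuilt_dt   = dhms(d, 0, 0, t); /* combine date and time again    */

  /* d, t and dt are left unformatted so the raw numbers are shown   */
  format next_week date_from_dt date9.
         time_from_dt time8.
         rebuilt_dt datetime20.;
  put d= t= dt= next_week= date_from_dt= time_from_dt= rebuilt_dt=;
run;
```

Leaving d, t, and dt without a format shows the underlying numbers (22646, 45296, and 1956659696), while the formatted variables show the familiar calendar and clock forms.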
Handling Missing Numeric Values
- In SAS, missing numeric values are represented by a period (.). When a numeric value is missing, SAS will automatically display it as a period in the output.
- Special missing values can also be used: .A, .B, .C, and so on through .Z, plus ._ (a period followed by an underscore).
- These special missing values allow you to distinguish between different types of missing data in your analysis.
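How missing values behave in expressions is easiest to see in a small example. This is a sketch only: the dataset name work.missing_demo and the variable names are hypothetical, and the meaning attached to .A is an assumption made for illustration.

```sas
/* Illustrative sketch: ordinary vs. special missing values.         */
data work.missing_demo;
  a = .;      /* ordinary missing                                    */
  b = .A;     /* special missing, e.g. "not applicable" (assumed)    */
  c = 5;

  is_missing_a = missing(a);       /* 1                              */
  is_missing_b = missing(b);       /* 1: .A is still missing         */
  differ       = (a ne b);         /* 1: . and .A are distinct       */

  total_sum    = sum(a, b, c);     /* SUM skips missing values: 5    */
  total_plus   = a + b + c;        /* + propagates missing: .        */
  n_nonmissing = n(a, b, c);       /* count of non-missing: 1        */

  put is_missing_a= is_missing_b= differ= total_sum= total_plus= n_nonmissing=;
run;
```

The contrast between SUM and the + operator matters in practice: functions such as SUM ignore missing arguments, whereas arithmetic operators return a missing result (and write a note to the log) when any operand is missing.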
Converting Numeric Data in SAS
Sometimes, it is necessary to convert numeric data into other data types or formats in SAS. Some common conversions include:
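- Numeric to Character:
- The PUT function converts numeric data to character data using the format you specify. The format's width and decimal specification determine the resulting text: 12.2 below keeps both decimal places, whereas 8. would round the value to 1234568.
data example; num = 1234567.89; char_num = put(num, 12.2); run;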
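- Character to Numeric:
- The INPUT function converts character data to numeric data using the informat you specify. The informat width must cover the whole string: the value below has 10 characters, so an 8. informat would read only the first 8 characters (1234567.) and drop the decimals.
data example; char_num = '1234567.89'; num = input(char_num, 10.); run;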
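A round trip between the two types, including a value that cannot be converted, might look like the following. This is a hedged sketch rather than a recommended pattern: the dataset name work.convert_demo and the variable names are hypothetical, and the widths were chosen to fit this particular value.

```sas
/* Illustrative sketch: numeric -> character -> numeric.             */
data work.convert_demo;
  length char_num $ 12;
  num       = 1234567.89;

  char_num  = put(num, 12.2);             /* right-aligned, width 12 */
  char_trim = strip(put(num, best12.));   /* no leading blanks       */

  back      = input(char_num, 12.);       /* width covers the string */

  /* ?? suppresses the log error and the _ERROR_ flag when the text  */
  /* is not a valid number; the result is then missing.              */
  bad       = input('abc', ?? 12.);

  /* Compare with a small tolerance rather than exact equality.      */
  round_trip_ok = (abs(back - num) < 1e-6);

  put char_num= char_trim= back= bad= round_trip_ok=;
run;
```

Checking the result with MISSING() after an INPUT call is a simple way to detect text that could not be read as a number.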
Conclusion
- In SAS, numeric data refers to any type of data that can be used for mathematical operations and statistical analysis, including integers, decimals, dates, times, and floating-point numbers.
- SAS stores numeric values in an 8-byte (64-bit) floating-point format, providing flexibility for storing a wide range of numbers, but with a precision of up to 15 digits.
- The way numeric values are displayed can be customized using various formats (e.g., BEST., DOLLAR., COMMA., E., etc.), which control the presentation of numbers in output.
- SAS numeric data is versatile, but users must also be mindful of precision limitations and handle missing values properly in their datasets.