Tuesday, 5 April 2016

Arrays in SAS


You must have seen arrays being used in various programming languages and must already know quite a bit about them, but before we go into the details let me tell you one thing : 

ARRAYS IN SAS ARE DIFFERENT FROM ANY OTHER LANGUAGE..!!

In most of the languages array is a data structure, holding data values, but in SAS it is not a data structure, it is just a collective name given to a group of variables. Being clear with this distinction is very integral to the part of using arrays successfully in SAS.

In SAS the most important function of the array is to reduce the lines of code where a programs involves repetitive calculations on different variables. I have seen most programmers shy away from their use, but let me tell you that it is a wonderful tool to make your code simple. Also in some instances arrays provide flexibility to make your code dynamic, I’ll get back to this later.

So let us get down to the dirty details :



What are arrays?

An Array is a grouping of variables of the same type used to perform repetitive operations on those variables.

Syntax:

Array array_name[dimension] $ length variable-list

Array – The ARRAY keyword for decleration

Array_name – Any valid SAS name (Do not use function names)

Dimension – Number of elements in the array(If unknown can use * but then we have to provide the varable list)

$ - Tells that the array is character array.

Length – Length of each variable

Variable list – List of variables to be part of the array(Can use named ranges)

Usage:

We have a dataset with 10 variables(kg1 to kg10) containing weights of a patients measure for 10 consecutive weeks. We want to convert these weight to pounds. One way we can do it is :

data kg_to_lbs;
      set weights;
      lbs1=kg1*2.2;
      lbs2=kg2*2.2;
      lbs3=kg3*2.2;
      ...
      lbs10=kg10*2.2;
run;

But this generates 13 lines of code for one simple calculation. Imagine when we have 100 such measurements, So here we can use arrays to make code shorter and easy to understand:

data kg_to_lbs;
      set weights;
      array kg_array {10} kg1-kg10;
      array lbs_array {10} lbs1-lbs10;
      do i = 1 to 10;
            lbs_array{i} = (kg_array{i})*2.2;
      end;
run;

cool na..and there is nothing complicated about this, you just need to declare an array and use a simple do loop.

Below are a few points about arrays which will be all you need to know about them to utilize arrays to their full potential.

Important points to note about arrays:

1 Variables used in an array must be of the same type. Either all numeric or all character.

2  Variables need not be preexisting variables, if they does not exist then SAS creates them for you. This converts into a useful application of creating variables through arrays.

3  SAS needs to know the size(number of elements) of the array while you are creating it. You can supply the size  in brackets next to the array name 

Array test[10] $5; 
Any array of 10 elements

Or you can let SAS count them for you using the number of variables in the variable list.

Array test1[*] $ var1 – var5

You cannot emit both the dimension and variable list together.

4)  Array does not accept numeric variables in the brackets whose value may define the dimension because it creates the array in compile time and the value of the variable used in brackets will be available only in compile time.

Array test[num_var];
ERROR : Array requires a numeric constant

5)  If we want to use all numeric or all character variables of a dataset without bothering about their names then you can declare the array like :

Array nums[*] _NUMERIC_;
Array nums[*] _CHARACTER_;

6)  Sometimes we need an array to hold values temporarily but do not want to output those variables to the final dataset then we can use temporary arrays. They are declare as :

Array arr_name[10] _TEMPORARY_;

Functions used with arrays :

DIM FUNCTION :

This function helps to determine the number of  elements in a array dynamically so while looping you do not have to hardcode it.
For e.g.

Do i=1 to 10;
A[i]= b[i] + c[i]
End;

Can also be written as :

Do i=1 to dim(a);
A[i]= b[i] + c[i]
End;

VNAME FUNCTION :

We get the values of variables in an array using the array name and subscript but if we want the name of the element(variable) by its subscript then we can use the vname function. For e.g.

array arr_name[*] X Y Z P Q R;
i=3;
vars = Vname(arr_name [i]); 

So vars will be initialized to the name of the third variable in the list which is Z.

OF OPERATOR :

This operator is very useful when we have to perform and operation on all elements(or variables) of an array. For e.g
We need a sum of all elements of an array we can write :
X=sum(a[1],a[2],a[3],a[4],a[5],a[6],a[7]…)

But if we don’t know how many elements are there or it changes every time then you need to update it again and again, so instead we can write

X=sum(of a[*]);

Cool and easy.. :-)


Sunday, 3 April 2016

Convert all Numeric Variables to Character using proc SQL

The task is to create a successor to a SAS data set, replacing each numeric variable in the original with a character variable. Here is a method which preserves variable order and variable attributes. First create a data set for the demonstration:

procsql;
createtable mixedtype as
 select*
  fromsashelp.class;

quit;

Result:

                  Age as
                    of 1
 First             March
 Name      Sex      2008    Height  Weight
 -----------------------------------------
 Alfred    M          14        69  112.50
 Alice     F          13      56.5   84.00
 Barbara   F          13      65.3   98.00
 Carol     F          14      62.8  102.50
 Henry     M          14      63.5  102.50
 James     M          12      57.3   83.00
 Jane      F          12      59.8   84.50
 Janet     F          15      62.5  112.50
 Jeffrey   M          13      62.5   84.00
 John      M          12        59   99.50
 Joyce     F          11      51.3   50.50
 Judy      F          14      64.3   90.00
 Louise    F          12      56.3   77.00
 Mary      F          15      66.5  112.00
 Philip    M          16        72  150.00
 Robert    M          12      64.8  128.00
 Ronald    M          15        67  133.00
 Thomas    M          11      57.5   85.00
 William   M          15      66.5  112.00

Next, this somewhat tricky query which uses DICTIONARY.COLUMNS as a source of metadata and creates the bulk of another query:

procsql;
selectcase type
        when'num'  thencatx (   ' '
                                , 'left( put ('
                                , name
                                , ','
                                , case
                                   whenformat is null then 'best12.'
                                   elseformat
                                   end
                                , ') ) as'
                                , name
                                , 'label ='
                                , quote( strip(label) )                              
                              )
        when'char' then name
        else             catx (   ' '
                                , quote('Error on type')
                                , 'as'
                                , name
                              )
        end
 into: selections separated by ' , '
 fromdictionary.columns
 wherelibname='WORK' andmemname='MIXEDTYPE';


To adapt for other tables change only the literals in the WHERE clause.

Result:

 Name
 Sex
 left( put ( Age , best12. ) ) as Age label = "Age as of 1 March 2008"
 left( put ( Height , best12. ) ) as Height label = ""
 left( put ( Weight , 6.2 ) ) as Weight label = ""

The INTO clause takes these expressions and concatenates them into a comma-separated string which it stores in the macro variable SELECTIONS, which in turn is used in the statement which actually does the work:

procsql;
createtable allchar asselect &selections frommixedtype;

quit;

Result:

 First          Age as of 1
 Name      Sex  March 2008    Height        Weight
 -------------------------------------------------
 Alfred    M    14            69            112.50
 Alice     F    13            56.5          84.00
 Barbara   F    13            65.3          98.00
 Carol     F    14            62.8          102.50
 Henry     M    14            63.5          102.50
 James     M    12            57.3          83.00
 Jane      F    12            59.8          84.50
 Janet     F    15            62.5          112.50
 Jeffrey   M    13            62.5          84.00
 John      M    12            59            99.50
 Joyce     F    11            51.3          50.50
 Judy      F    14            64.3          90.00
 Louise    F    12            56.3          77.00
 Mary      F    15            66.5          112.00
 Philip    M    16            72            150.00
 Robert    M    12            64.8          128.00
 Ronald    M    15            67            133.00
 Thomas    M    11            57.5          85.00
 William   M    15            66.5          112.00

Another query against DICTIONARY.COLUMNS compares the attributes of the original and replacement variables:

 select   name    format=$10.
        , memname format=$10.
        , varnum  format=10.
        , type    format=$8.
        , length  format=6.
        , format  format=$8.
        , label   format=$30.
  fromdictionary.columns
  wherelibname='WORK' andmemname in ('ALLCHAR','MIXEDTYPE')
  orderby varnum, memname desc;

 quit;

Result:

                             Column
 Column      Member       Number in  Column    Column  Column
 Name        Name             Table  Type      Length  Format    Column Label
 --------------------------------------------------------------------------------------
 Name        MIXEDTYPE            1  char           8            First Name
 Name        ALLCHAR              1  char           8            First Name
 Sex         MIXEDTYPE            2  char           1  $1.
 Sex         ALLCHAR              2  char           1  $1.
 Age         MIXEDTYPE            3  num            8            Age as of 1 March 2008
 Age         ALLCHAR              3  char          12            Age as of 1 March 2008
 Height      MIXEDTYPE            4  num            8
 Height      ALLCHAR              4  char          12
 Weight      MIXEDTYPE            5  num            8  6.2
 Weight      ALLCHAR              5  char           6

This demonstrates that variable order and variable attributes are preserved.

Keyboard Shortcuts within the SAS Main Window




Category Key Combination Action
Dialog Boxes and Entry Fields Tab move to next field
Shift + Tab move to previous field
Navigate around Text
Ctrl + -> (right arrow) move to next word
Ctrl + <- (left arrow) move to previous word
Home move to beginning of line
End move to end of line
Ctrl + Home move to top
Ctrl + End move to bottom
Page Up page up
Page Down page down
Ctrl + Page Up move to top
Ctrl + Page Down move to bottom
Ctrl + Tab navigate to the next open SAS window (NEXTWIND command)
Ctrl + Shift + Tab navigate to the previous open SAS window (PREVWIND command)
Mark Text Shift + -> (right arrow) mark while going to the right
Shift + <- (left arrow) mark while going to the left
Shift + Home mark to beginning of line
Shift + End mark to end of line
Shift + Ctrl + Home mark to top
Shift + Ctrl + End mark to bottom
Shift + Page Up page up and mark
Shift + Page Down page down and mark
Shift + Ctrl + Page Up mark to top
Shift + Ctrl + Page Down mark to bottom
Shift + MB1 extend the current marked text selection to the click position
Cut, Copy, and Paste Delete delete the next character (or marked text)
Ctrl + Delete delete from the insertion point position to the end of the current word
Ctrl + Backspace delete from the insertion point position to the start of the current word
Ctrl + MB1 selects the entire line (clicked line)
Ctrl + Z undo previous action
Ctrl + X cut selected text
Ctrl + C copy selected text to paste buffer
Ctrl + V paste text
Window Control Alt switch focus to or from the main menu bar
Shift + F5 cascade the windows
Shift + F4 tile the windows vertically
Shift + F3 tile the windows horizontally
Ctrl + F6 next window
Alt + F4 exit SAS
Ctrl + F4 close the active window
Ctrl + W access new SAS explorer window
Shift + F10 open pop-up menu
Resizing the Docking View Alt + W + S start docking view resizing
-> (right arrow) move the split bar a small amount to the right
<- (left arrow) move the split bar a small amount to the right
Ctrl + -> (right arrow) move the split bar a larger amount to the right
Ctrl + <- (left arrow) move the split bar a larger amount to the left
Home move the split bar all the way to the left
End move the split bar all the way to the right
Return accept the current size of the docking view and exit docking view resizing
Esc end docking view resizing without resizing the docking view
Miscellaneous
Alt + Enter open the Properties dialog box for a selected object
This command is valid only in a Tree view or a List view.
Esc + letter (or number) color or highlighting attributes in NOTEPAD window


for more   click here