Wednesday, 14 April 2010

Some SAS Snippets

Don't show the current date on reports:

options nodate;

To load/import SAS datasets (sas7bdat files) you tell SAS about the library (folder) which they're in:

libname Morty "Z:\path\to\Dummy Data\";

This will create a library icon that will show up on the left in the library explorer. I want to copy the files to my work folder using more manageable names:

data Dec09;
    set Morty.cybdec09;
run;
data Jan10;
    set Morty.cybjan10;
run;
data Feb10;
    set Morty.cybfeb10;
run;

Next, I plan to merge the datasets into a single table by a mutual variable (column) name, Account_no, but I need to sort the datasets by that before merging.

proc sort data=Dec09;
    by Account_no;
run;
proc sort data=Jan10;
    by Account_no;
run;
proc sort data=Feb10;
    by Account_no;
run;


data allmort;
    merge Dec09 Jan10 Feb10;
    by Account_no;
run;

What if I want to insert Datalines as CSV (delimited by commas) by hand instead of the horrible default manual space-padded SAS format:

data trydel;
    input usrname $ finger_count;
    infile datalines delimiter=",";
    datalines;
Adam, 10
Bob, 11
Chris, 9
Dan, 10
Eddy, 12
Fred, 10
Gary, 7
Hugh, 5
Ian, 10
Joe, 10
Kev, 10
Lee, 10
Mike, 9
Norm, 10
Otis, 8
Pat, 9
;
run;
 
Sort that data by a variable (column):
 
proc sort data=trydel;
    title "Everybody by finger count";
    by DESCENDING finger_count DESCENDING usrname;
run;

SAS proc sort-by arguments are not seperated with commas and you should type DESCENDING in full and before the variable (column) name, unlike SQL where DESC and ASC are sufficient as acronyms and it comes after the variable name. Also the SAS keyword ASCENDING doesn't exist! It sorts ascending by default.

Make a standard report without SQL:

proc print data=trydel noobs;
    title "More than 8 Fingers";
    where finger_count > 8;
run;

"Noobs" (NO OBServation count) makes sure that the SAS report doesn't show a row count next to the data.

Now use SQL to get the data you need:

proc sql;
    title "SQL Where Names gt Bob";
    SELECT usrname, finger_count format=z2.
    FROM Work.trydel WHERE usrname > "Bob"
    ORDER BY finger_count DESC, usrname DESC;
quit;

Note that the library name Work (or wherever your dataset is), has to be explicitly named before the dataset (table) name, like so: Work.dataset. (Library.Table).
Also note the "format=z2." part. It formats the number with leading zeros, so instead of showing 7, 9, 11, 10, it will show 07, 09, 11, 10. Of course, you need to find out how many digits to set, so if you wanted to show more leading zeros, put a larger number after z.

Another thing that is different is the string length checking SQL function, often named STRLEN in other diallects. In SAS it's just "LENGTH". Usage: SELECT * FROM table WHERE LENGTH(usrname) > 2.

Other useful Proc SQL functions include IS NULL / IS NOT NULL and IS MISSING / IS NOT MISSING.

This is a good resource: http://en.wikiversity.org/wiki/Data_Analysis_using_the_SAS_Language

No comments:

Post a Comment