Monday 18 October 2010

Unique value row count in SAS

You may have come across the need to count unique values in your dataset and put that as a row in your dataset/table - something like this:

id, name
1, Alan
2, Brad
2, Brad
3, David
3, David
3, David
4, George
5, Joe
6, Steven
7, Zed
7, Zed

The solution involves using the LAG() function. Lag is like a look-back function and for each row that is processed it looks back a row and fetches the value. So, if I were at observation (row) 2 and did LAG(name), it would return the value of "Alan". The retain function puts an initial temporary value in a variable so I can use it in my processing.

Anyway, here's the code:

data pupils;
    input name $;
    datalines;
Alan
Brad
Brad

David
David
David
George
Joe
Steven
Zed
Zed
;
run;


data store;
    set pupils;

    prevname = lag(name);

    format id BEST12.;
    retain id 0;
     if name^=prevname then id = id + 1;

    drop prevname;
run;

No comments:

Post a Comment