tag:blogger.com,1999:blog-50556960899920836022024-03-05T11:17:14.870-08:00SAS on UNIX@hsphUsing Statistical program SAS on UNIX operating system. Directed towards Harvard University School of Public Health students and researchers but should be useful to many.Unknownnoreply@blogger.comBlogger44125tag:blogger.com,1999:blog-5055696089992083602.post-18919796694440673132011-03-14T17:33:00.000-07:002011-03-14T17:34:28.476-07:00direct filename with path to a filedir /s/b>list.txtUnknownnoreply@blogger.com34tag:blogger.com,1999:blog-5055696089992083602.post-37059998122113436472011-03-14T08:49:00.001-07:002011-03-14T08:49:40.435-07:00Regular expression wildcard.+<div><br /></div><div><a href="http://www.ats.ucla.edu/stat/sas/code/perl_wildcard.htm">http://www.ats.ucla.edu/stat/sas/code/perl_wildcard.htm</a></div>Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-5055696089992083602.post-58993316927211465022011-03-10T12:36:00.000-08:002011-03-10T12:55:57.187-08:00Regular expressions1. Search and extract<div>a. Use Prxmatch before Prxposn<div><br /></div><div><br /></div><div>2. Remove whitespace </div><div><br /></div><div><br /></div><div>readhtml</div><div>xls2csv to speed up the reading</div><div><br /></div><div><br /></div></div>Unknownnoreply@blogger.com3tag:blogger.com,1999:blog-5055696089992083602.post-71693173605066099412008-05-08T08:07:00.000-07:002008-05-08T08:12:39.469-07:00Create a variable having value of median in categoriesTo find the median values of the categories (e.g. 
to assess test of trend)<br /><br /><br />%macro hint(var1,var2);<br /> <br />proc means data=fret median;<br /> class &var1;<br /> var &var2;<br /> OUTPUT OUT=&var2 MEDIAN= ;<br />run;<br /><br /> data _null_;<br />set &var2;<br />call symputx("&var2.m"||put(_n_,1.),&var2,'g');<br />run;<br />%put &var2.m;<br />%put _user_;<br />%mend;<br />**depending on the number of categories (n), you would have n+1 macro variables.<br />**********run like this *************;<br />%hint(_alcoc,alc) /***_alcoc has 4 categories **/<br /><br />/*** this would create macro variables as &alcm2 - &alcm(n+1)***/<br /><br />data new;<br /> set fret;<br /><br /> if _alcoc =. then alcm = .;<br />else if _alcoc =1 then alcm=&alcm2; /**note this is &alcm2 ***/<br />else if _alcoc =2 then alcm=&alcm3;<br />else if _alcoc= 3 then alcm=&alcm4;<br />else if _alcoc = 4 then alcm=&alcm5;Unknownnoreply@blogger.com3tag:blogger.com,1999:blog-5055696089992083602.post-76488662705688190952008-03-27T11:32:00.000-07:002008-03-27T11:46:26.438-07:00Coding interactions/effect modification to get Confidence Intervals<span style="font-weight: bold;">/************ THIS IS 4 X 3 interaction *************/</span><br /><br /><br />VARA0VARB0=0; VARA0VARB1=0;VARA0VARB2=0; VARA0VARB3=0;<br />VARA1VARB0=0; VARA1VARB1=0;VARA1VARB2=0; VARA1VARB3=0;<br />VARA2VARB0=0; VARA2VARB1=0;VARA2VARB2=0; VARA2VARB3=0;<br />VARA3VARB0=0; VARA3VARB1=0;VARA3VARB2=0; VARA3VARB3=0;<br /><br />IF VARA=0 THEN DO;<br /> IF VARB=0 THEN VARA0VARB0=1;<br /> IF VARB=1 THEN VARA0VARB1=1;<br /> IF VARB=2 THEN VARA0VARB2=1;<br /> IF VARB=3 THEN VARA0VARB3=1;<br />END;<br /><br />IF VARA=1 THEN DO;<br /> IF VARB=0 THEN VARA1VARB0=1;<br /> IF VARB=1 THEN VARA1VARB1=1;<br /> IF VARB=2 THEN VARA1VARB2=1;<br /> IF VARB=3 THEN VARA1VARB3=1;<br />END;<br /><br />IF VARA=2 THEN DO;<br /> IF VARB=0 THEN VARA2VARB0=1;<br /> IF VARB=1 THEN VARA2VARB1=1;<br /> IF VARB=2 THEN VARA2VARB2=1;<br /> IF VARB=3 THEN VARA2VARB3=1;<br />END;<br /><br />IF VARA=3 
THEN DO;<br /> IF VARB=0 THEN VARA3VARB0=1;<br /> IF VARB=1 THEN VARA3VARB1=1;<br /> IF VARB=2 THEN VARA3VARB2=1;<br /> IF VARB=3 THEN VARA3VARB3=1;<br />END;<br /><br />%LET VARB_VARAINT=<br /> VARA0VARB1 VARA0VARB2 VARA0VARB3<br />VARA1VARB0 VARA1VARB1 VARA1VARB2 VARA1VARB3<br />VARA2VARB0 VARA2VARB1 VARA2VARB2 VARA2VARB3<br />VARA3VARB0 VARA3VARB1 VARA3VARB2 VARA3VARB3;<br /><br /><span style="color: rgb(0, 0, 153);">/* To find the point estimates and confidence intervals of each level of VARA with each level of VARB just enter the &VARB_VARAINT</span><br /><span style="color: rgb(0, 0, 153);">in the models */</span><br /><br /><span style="color: rgb(0, 0, 153);">/* you can tailor the code with find and replace in SAS/Xemacs/Word/Textpad. */</span><br /><br /><br /><span style="font-weight: bold;">/************ THIS IS 4 X 4 interaction *************/</span><br /><br /><br />VARA0VARB0=0; VARA0VARB1=0; VARA0VARB2=0; VARA0VARB3=0; /* 1ST QUINTILE*/<br />VARA1VARB0=0; VARA1VARB1=0; VARA1VARB2=0; VARA1VARB3=0; /* 2ND QUINTILE*/<br />VARA2VARB0=0; VARA2VARB1=0; VARA2VARB2=0; VARA2VARB3=0; /* 3RD QUINTILE */<br />VARA3VARB0=0; VARA3VARB1=0; VARA3VARB2=0; VARA3VARB3=0; /* 4TH QUINTILE */<br /><br />IF VARA=0 then do;<br />if VARB=0 then VARA0VARB0=1;<br />if VARB=1 then VARA0VARB1=1;<br />if VARB=2 then VARA0VARB2=1;<br />if VARB=3 then VARA0VARB3=1;<br />END;<br /><br />IF VARA=1 then do;<br />if VARB=0 then VARA1VARB0=1;<br />if VARB=1 then VARA1VARB1=1;<br />if VARB=2 then VARA1VARB2=1;<br />if VARB=3 then VARA1VARB3=1;<br />END;<br /><br />IF VARA=2 then do;<br />if VARB=0 then VARA2VARB0=1;<br />if VARB=1 then VARA2VARB1=1;<br />if VARB=2 then VARA2VARB2=1;<br />if VARB=3 then VARA2VARB3=1;<br />END;<br /><br /><br />IF VARA=3 then do;<br />if VARB=0 then VARA3VARB0=1;<br />if VARB=1 then VARA3VARB1=1;<br />if VARB=2 then VARA3VARB2=1;<br />if VARB=3 then VARA3VARB3=1;<br />END;<br /><br /><br />%LET VARA_VARBINT=<br />VARA0VARB1 VARA0VARB2 VARA0VARB3<br 
/>VARA1VARB0 VARA1VARB1 VARA1VARB2 VARA1VARB3<br />VARA2VARB0 VARA2VARB1 VARA2VARB2 VARA2VARB3<br />VARA3VARB0 VARA3VARB1 VARA3VARB2 VARA3VARB3 ;Unknownnoreply@blogger.com228tag:blogger.com,1999:blog-5055696089992083602.post-59338380082936425752008-02-01T17:43:00.000-08:002008-02-01T17:45:21.048-08:00Use a sample to hasten preliminary analysisI have used following ways;<br /><br />**************************<br />proc surveyselect data=onenn method=srs n=10000 out=onen;<br />run;<br /><br />**************************<br /> data onen;<br /> merge<br /> fa7684 fa8694 fa9600 nur92 nur94 n94_dt<br /> temp(in=mstr) nur96 nur98 act8600<br /> nur82 nur88 n84_dt n86_dt n90_dt<br /> fileb n767880 meddata<br /> temp db7602<br /> fatalmi mi stroke anginew<br /> fatalstk deadff2004 pact spact end=_end_;<br /> <br /> by id;<br /> exrec=1;<br /> if first.id and mstr then exrec=0; /*** mas = master file ie. n80_cf ***/<br /> if famdb82=1 then famdb88=1;<br /> else famdb88=0;<br /> random=RANUNI(-1); /* GENERATE A RANDOM VECTOR */<br />%let k=5000; <br />run;<br /><br />PROC SORT DATA=onen;<br /> BY random; /* SORT OBSERVATIONS BY THE RANDOM VECTOR */<br />run;<br /><br />DATA onensample;<br /> SET onen(drop=random);<br /> IF _N_ le &k; /* SELECT THE FIRST K OBSERVATIONS */<br /> /*both magneUnknownnoreply@blogger.com1tag:blogger.com,1999:blog-5055696089992083602.post-55179478571634089652007-06-27T04:52:00.000-07:002007-06-27T05:01:13.479-07:00Concatenating sas macro variables<span style="font-family: times new roman;font-size:100%;" >I always forget how to do it.<br /><span style="font-weight: bold;"><br /></span>%let nagasuchi=cases;<br />data new;<br /> set library.old&nagasuchi;<br /><br /> This would be read as<br /> data new;<br /> set library.oldcases;<br /><br /><br />If the macro variable is a prefix<br /><br />data new;<br />set library.&nagasuchi.old;<br />/* note the period*/<br /><br />This would be read as<br />data new;<br />set library.casesold;<br /><br />If the 
character following a macro variable is a period, then you need to use two periods.<br />set in&nagasuchi..select;<br /><br />After resolution, SAS would read this as SET in<b>cases.select</b>;</span> <br /><br /><br />More on this <a href="http://www.caspur.it/risorse/softappl/doc/sas_docs/macro/z1071889.htm">here</a>.Unknownnoreply@blogger.com79tag:blogger.com,1999:blog-5055696089992083602.post-38488046590662401302007-06-15T19:35:00.000-07:002007-06-15T19:45:03.199-07:00LaTeX output in sas/*Use one of the following ods statements */<br />/* Legacy LaTeX for ODS */<br />ods tagsets.latex file="legacy.tex";<br /><br />/* Legacy LaTeX with color for ODS */<br />ods tagsets.colorlatex file="color.tex" stylesheet="sas.sty"(url="sas");<br /><br />/* Simplified LaTeX output that uses plain LaTeX tables */<br />ods tagsets.simplelatex file="simple.tex" stylesheet="sas.sty"(url="sas");<br /><br />/* Same as above, but only prints out tables (no titles, notes, etc.) */<br />/* Also, prints each table to a separate file */<br />ods tagsets.tablesonlylatex file="C:\Documents and Settings\mk\My Documents\tablesonly.tex" (notop nobot) newfile=table;<br /><br />proc reg data=sashelp.class;<br /> model Weight = Height Age;<br />run;quit;<br /><br />/*Use one of the following ods statements corresponding to open statements*/<br /><br />ods tagsets.latex close;<br />ods tagsets.colorlatex close;<br />ods tagsets.tablesonlylatex close;<br />ods tagsets.simplelatex close;<br /><br />/*from <a href="http://support.sas.com/rnd/base/topics/odsmarkup/latex.html">SAS*/</a>Unknownnoreply@blogger.com3tag:blogger.com,1999:blog-5055696089992083602.post-22931472680352502362007-05-03T22:34:00.000-07:002007-05-03T22:40:24.826-07:00Unix Banner for interesting titlesASCII Banner <a href="http://www.angelfire.com/dc/apurvb/asc/bottom.html">1</a> , <a href="http://www.network-science.de/ascii/">2</a><br /><br /><p><span style="color: rgb(0, 0, 0);"><br /><table bg border="0" width="100%" 
style="color:#ffffff;"><tbody><tr><td><pre><span style="color: rgb(102, 0, 0); font-weight: bold;"> .'|. . '|| </span><br /><span style="color: rgb(102, 0, 0); font-weight: bold;">.||. .... .||. .... .... .. ... .. || </span><br /><span style="color: rgb(102, 0, 0); font-weight: bold;"> || '' .|| || ||. ' '' .|| || || .' '|| </span><br /><span style="color: rgb(102, 0, 0); font-weight: bold;"> || .|' || || . '|.. .|' || || || |. || </span><br /><span style="color: rgb(102, 0, 0); font-weight: bold;">.||. '|..'|' '|.' |'..|' '|..'|' .||. ||. '|..'||. </span><br /><span style="color: rgb(102, 0, 0); font-weight: bold;"> </span><br /><span style="color: rgb(102, 0, 0); font-weight: bold;"> </span><br /><span style="color: rgb(102, 0, 0); font-weight: bold;"> || </span><br /><span style="color: rgb(102, 0, 0); font-weight: bold;"> .... .. ... ... . ... .. ... .... </span><br /><span style="color: rgb(102, 0, 0); font-weight: bold;">'' .|| || || || || || || || '' .|| </span><br /><span style="color: rgb(102, 0, 0); font-weight: bold;">.|' || || || |'' || || || .|' || </span><br /><span style="color: rgb(102, 0, 0); font-weight: bold;">'|..'|' .||. ||. '||||. .||. .||. ||. '|..'|' </span><br /><span style="color: rgb(102, 0, 0); font-weight: bold;"> .|....' </span><br /><br /><br /><span style="color:#000000;"><pre> __ _ _ _ <br />/ _| __ _| |_ ___ __ _ _ __ __| | __ _ _ __ __ _(_)_ __ __ _<br />| |_ / _` | __/ __| / _` | '_ \ / _` | / _` | '_ \ / _` | | '_ \ / _` |<br />| _| (_| | |_\__ \ | (_| | | | | (_| | | (_| | | | | (_| | | | | | (_| |<br />|_| \__,_|\__|___/ \__,_|_| |_|\__,_| \__,_|_| |_|\__, |_|_| |_|\__,_|</pre></span> </pre><!-- white text and background :) /!--> <span style="color: rgb(255, 255, 255);">Have fun. 
Don't forget to bookmark this website :)</span> </td></tr></tbody></table> <hr size="1"> </span> </p>Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-5055696089992083602.post-48858464828203340722007-04-19T18:59:00.000-07:002007-04-19T19:27:30.503-07:00Make results more presentableSAS regression output requires additional steps to make it presentable. In previous posts, I have highlighted how to only get the results for variables of interest. 
However, these results are in row format:<br /><br />Obs F1 or lci uci p_value<br /><br /> 1 exp1 1.202 1.110 1.34 .001<br /> 2 exp2 1.340 1.202 1.56 .001<br /> 3 exp3 1.560 1.340 1.89 .001<br /> 4 exp4 1.890 1.560 1.98 .001<br /><br /><br /><br />This output needs to be further transposed in Excel to get the results in the following format.<br /><br /> Obs exp0 exp1 exp2 exp3 exp4<br /><br /> 1 1 1.2020 1.3400 1.5600 1.8900<br /> 2 1.11,1.34 1.202,1.56 1.34,1.89 1.56,1.98<br /> 3 0.0010 0.0010 0.0010 0.0010<br /><br /><br />The following program eliminates that manual step:<br /><br />proc import datafile="C:\Documents and Settings\mkaushik\Desktop\Output results.xls" out=auto replace;<br />run;<br />data inter /* / view=intermediate*/;<br /> set auto;<br />orc= put(or,6.4);<br /> pvaluec= put(p_value,6.4);<br />new=compress(lci||','||uci);<br /> output; *output the input record;<br />if _n_=1 then do;<br />F1='exp0';<br />or=1; *set values for your added obs;<br />lci=.;uci=.;p_value=.;<br />orc="1";<br />pvaluec=" ";<br />new=" ";<br />output; *output your added obs;<br />end;<br />proc sort;<br />by F1;<br />run;<br />data inter;<br />set inter;<br />array Value (*) orc new pvaluec;<br />do id =1 to 3;<br />_value_ = value [id]; * since first numeric is that date;<br />F1=F1;<br />output;<br />end;<br />drop or p_value _name_;<br />run;<br />proc print data=inter;<br />run;<br /><br />proc sort data=inter;<br />by id;<br />run;<br />proc transpose data=inter out=outset(drop=id _name_) ;<br /> by id ;<br /> id F1 ;<br /> var _value_;<br />run;<br />proc print data=outset;<br />run;Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-5055696089992083602.post-91254145984690783122007-04-05T12:57:00.000-07:002007-04-05T16:10:29.185-07:00Correspondence between genmod and logisticI have been exploring this for my work. I found some guidance <a href="http://cc.uoregon.edu/cnews/spring2005/saslogistic.htm">here</a>. 
I am including some details here.<br /><span style="font-size:85%;">data drug;<br />input drug$ x r n;<br />cards;<br />A .1 1 10<br />A .23 2 12<br />A .67 1 9<br />B .2 3 13<br />B .3 4 15<br />B .45 5 16<br />B .78 5 13<br />C .04 0 10<br />C .15 0 11<br />C .56 1 12<br />C .7 2 12<br />D .34 5 10<br />D .6 5 9<br />D .7 8 10<br />E .2 12 20<br />E .34 15 20<br />E .56 13 15<br />E .8 17 20<br />;<br /><br />proc genmod data=drug;<br />class drug;<br />model r/n=x drug / dist=binomial link=logit;<br />estimate 'A vs E' drug 1 0 0 0 -1/exp;<br />run;<br /><br />proc logistic data=drug;<br />class drug/<span style="font-weight: bold;">param=ref</span>;<br />model r/n=x drug;<br />run;</span><br /><span style="font-size:85%;"><br />/* generates dummy variables coded as follows<br /> drug A 1 0 0 0<br /> B 0 1 0 0<br /> C 0 0 1 0<br /> D 0 0 0 1<br /> E 0 0 0 0 */<br /><br /><br /></span><span style="font-size:85%;"> proc logistic data=drug;<br />class drug;<br />model r/n=x drug;<br />run;<br /><br /><br />/* generates dummy variables coded as follows<br /><br /> Class Value Design Variables<br /><br /> drug A 1 0 0 0<br /> B 0 1 0 0<br /> C 0 0 1 0<br /> D 0 0 0 1<br /> E -1 -1 -1 -1<br />*/</span><br /><br /><br />However, the results (Odds ratio) are going to be the same. 
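<br /><br />As a rough sketch of why the odds ratios agree, assuming the default effect coding shown above: the implied coefficient for drug E is minus the sum of the other four, so the log odds ratio for A vs E is 2*b(A) + b(B) + b(C) + b(D). The contrast statement below (its label is arbitrary) asks proc logistic for that combination and exponentiates it, which should correspond to the 'A vs E' estimate from proc genmod.<br /><span style="font-size:85%;">proc logistic data=drug;<br />class drug; /* default effect coding, E is the last level */<br />model r/n=x drug;<br />contrast 'A vs E' drug 2 1 1 1 / estimate=exp; /* 2*b(A)+b(B)+b(C)+b(D) */<br />run;</span><br /><br />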
Accessible help on writing contrast statements is <a href="http://www.google.com/url?sa=t&ct=res&cd=1&url=https%3A%2F%2Fwww.atlas.uiuc.edu%2Fdata_stats%2Fresources%2Fsas%2FSAS_SPSS-contrasts.pdf&ei=b4EVRrbSM43IggSj_dHJCw&usg=__qYXmvewr15GSajrzj2m1veukp9g=&sig2=G1qoiTKHYoSWTdrAN3u0mg">here</a>.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5055696089992083602.post-33434530683000500642007-04-02T16:42:00.000-07:002007-04-03T07:48:25.698-07:00Proc tabulate/proc reportI have been trying to learn these new tools.<br /><br />Here is a <a href="http://www.laurenhaworth.com/publications/71-28.pdf">pdf</a> with clear instructions and uses of both.<br /><span style="font-size:78%;"><br />Some available options are ACROSS, ANALYSIS, CENTER, COLOR, COMPUTED, CSS, CV, DESCENDING, DISPLAY, EXCLUSIVE, F, FLOW, FORMAT, GROUP, ID, ITEMHELP, LEFT, MAX, MEAN, MEDIAN, MIN, MISSING, N, NMISS, NOPRINT, NOZERO, ORDER, P1, P10, P25, P5, P50, P75, P90, P95, P99, PAGE, PCTN, PCTSUM, PRELOADFMT, PRT, Q1, Q3, QRANGE, RANGE, RIGHT, SPACING, STD, STDERR, STYLE, SUM, SUMWGT, T, USS, VAR, WEIGHT, WGT, WIDTH.</span>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5055696089992083602.post-35482996936410203542007-04-01T19:00:00.000-07:002007-04-01T19:06:37.608-07:00Macro to output only relevant results to ods<span style="font-size:85%;">/** following macro runs logistic regression and outputs the results from only the relevant variables to the html file. In this case, I am running regression with various variables but keeping age in all the models. Age is not the variable of interest. 
*/<br /><br />%macro jncht(dataname,var1);<br />title "Age and sex adjusted &var1";<br />ods select OddsRatios ParameterEstimates;<br />proc logistic data=&dataname ;<br />model jncht=&var1 ahage ahsex ;<br />ods output OddsRatios=orrr;<br />ods output ParameterEstimates=Param;<br />run;<br /><br />data param;<br />set param;<br />drop DF Estimate StdErr;<br />run;<br /><br />proc sort; <br />by variable;<br /><br /> data orrr;<br /> set orrr;<br /> variable=effect;<br /> drop effect;<br />proc sort;<br />by variable;<br />run;<br /><br /> data new;<br /> merge param orrr;<br /> by variable;<br /> run;<br />%let cuts=%SUBSTR(&var1,1,4);<br /><br />ods html select all;<br />title " age and sex adjusted &var1";<br /> proc print data=new;<br /> var variable OddsRatioEst LowerCL UpperCL ProbChiSq;<br />where variable like "%NRBQUOTE(%)&cuts%NRBQUOTE(%)";<br /> run;<br />ods html exclude all;<br />proc datasets;<br />delete param orrr new;<br />quit;<br />%mend ;<br /><br /><span style="font-weight: bold;">Invocation of this macro.</span><br />/*include the following statement in the beginning of the program*/<br />ods html file ="%sysfunc<br />(reverse(%sysfunc(substr(%sysfunc(reverse(%sysfunc(reverse(%scan(%sysfunc(reverse(%sysfunc(getoption(sysin)))),1,/))))),5)))).html"<br />STYLE=MINIMAL;<br /><br />data...;<br />.<br />.<br />.;<br />%jncht(trott1,&wstci_)<br /><br />/*Where &wstci_ is a macro variable and equals 'wstci1 wstci2 wstci3'.<span style="font-weight: bold;"> */<br /><br /></span></span>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5055696089992083602.post-39289248775984575922007-04-01T18:46:00.000-07:002007-04-01T19:09:10.250-07:00Wildcards in different situations<span style="font-family: courier new;font-size:85%;" >This is a broad area to handle in one posting but here are a few links that I found useful:<br /><br /></span><h4 style="font-weight: normal; font-family: courier new;"><span style="font-size:85%;"><a 
href="http://www.ats.ucla.edu/stat/sas/code/perl_wildcard.htm"> Matching with a wildcard using Perl regular expression </a>A possible problem with this method is its inability to handle macro variables.</span></h4><h4 style="font-weight: normal; font-family: courier new;"><span style="font-size:85%;"><a href="http://books.google.com/books?vid=ISBN1555446817&id=wubfWLLuyI0C&amp;pg=PA63&lpg=PA63&ots=EaOgdSOucm&dq=sas+wildcard+like&sig=yXOYdx9sbco0A_bm2KAal7HhzAE">Matching with like and percent (%)</a> It is used in the next post.</span></h4><h4 style="font-weight: normal;"><span style="font-size:85%;"><a style="font-family: courier new;" href="http://www.stat.ncsu.edu/sas/samples/base/manyfiles.html">Using wildcards to read many files into one SAS data set</a></span><br /></h4>Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-5055696089992083602.post-15002405551601441412007-03-31T09:20:00.000-07:002007-04-01T19:12:43.571-07:00Interactions in logistic regression using proc genmod<span style="font-size:85%;">I have been trying to do logistic regression with interactions. Since this would have required a lot of dummy coding in proc logistic, I used <a href="http://www2.stat.unibo.it/ManualiSas/stat/chap29.pdf">proc genmod</a>.<br /><br />proc genmod; <br />class ahsex(ref=first) ah66a qtype &smok_ &alco_; <br />model jncht= wstrs wstrs*ahsex ahsex ah66a ahage &smok_ &alco_ WLTHINDF QTYPE/ error=bin link=logit type3;<br />run; <br /><br />Another way to encode the interaction term and the main effects would be using "</span><span style="font-size:85%;">wstrs|ahsex". This is equivalent to </span><span style="font-size:85%;">"wstrs wstrs*ahsex ahsex".<br /></span><span style="font-size:85%;">The ref option could be replaced with (ref="1"). 
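<br /><br />A minimal sketch combining the two variants just mentioned, the bar notation and a quoted reference level; it assumes ahsex has a formatted value of 1 and leaves out the smoking and alcohol macro variables for brevity:<br />proc genmod;<br />class ahsex(ref="1") ah66a qtype;<br />model jncht= wstrs|ahsex ah66a ahage WLTHINDF qtype / dist=bin link=logit type3; /* wstrs|ahsex expands to wstrs ahsex wstrs*ahsex */<br />run;<br /><br />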
Specifying more than one REF= variable option in the CLASS statement</span><span style="font-size:85%;"><span style="font-family:courier new;"> could be a <a href="http://support.sas.com/techsup/unotes/SN/013/013510.html">problem</a>.</span><br /><br />In case you are wondering, </span><span style="font-size:85%;">&smok_ refers to the macro variable I created earlier in the code as follows:<br />%let smok_= smok1 smok2 smok3 smokm;<br /><br /><br /></span>Unknownnoreply@blogger.com4tag:blogger.com,1999:blog-5055696089992083602.post-16222997700962425692007-03-21T09:04:00.000-07:002007-03-21T09:07:31.733-07:00Find the number of observations in the datasetI work with huge datasets - more than 100,000 observations followed up for 20 years. In the process of merging datasets read in through macros, I have lost track of the number of observations.<br /><br />The following code gives the number of observations in the dataset - dataname:<br />%let dsid=%sysfunc(open(dataname));<br />%let num=%sysfunc(attrn(&dsid,nobs));<br />%let rc=%sysfunc(close(&dsid));<br />%put There are &num observations in dataset dataname.;<br /><br /><span style="font-size:78%;"><br />This is from the <a href="http://support.sas.com/ctx/samples/index.jsp?sid=592">SAS samples</a>.</span>Unknownnoreply@blogger.com6tag:blogger.com,1999:blog-5055696089992083602.post-37232928300231588652007-03-19T19:29:00.000-07:002007-03-21T14:43:53.772-07:00Centering around mean or calculating standard deviation<span style="font-size:85%;"><span style="font-family:courier new;">data original;<br /></span><span style="font-family:courier new;">set original;</span><br /><span style="font-family:courier new;">var=1; /*creates a constant variable*/</span><br /><span style="font-family:courier new;"><br />/*creates means ,standard deviation and no of obs and puts them in dataset 
called starwars which has only one observation*/</span><br /><span style="font-family:courier new;">proc means data=original;</span><span style="font-family:courier new;"><br />var ahbmi ah98 ah99 ah9900 ;</span><br /><span style="font-family:courier new;">OUTPUT OUT=starwars MEAN=avbmi av98 av99 STD=stbmi stah98 stah99 N=nbmi n98 n99 ;<br /></span><span style="font-family:courier new;">run;</span><br /><br /><span style="font-family:courier new;"><br />data starwars;</span><br /><span style="font-family:courier new;">set starwars;</span><br /><span style="font-family:courier new;">var=1; /*creates constant variable for merging with original dataset*/</span><br /><span style="font-family:courier new;">drop _freq_ _type_;</span><br /><br /><span style="font-family:courier new;">data original ;</span><br /><span style="font-family:courier new;">merge original starwars;</span><br /><span style="font-family:courier new;">by var;</span><br /><span style="font-family:courier new;"><br />centerbmi=ahbmi-avbmi; /*centers bmi*/</span><br /><span style="font-family:courier new;">bmisd=ahbmi/stbmi;/*creates variable to do regression with each unit increment of standard deviation*/</span><br /><br /><br /><br /><span style="font-family:courier new;">/**************Alternate way***************************/<br /></span></span><span style=";font-family:courier new;font-size:85%;" >data original;<br />set original;<br />proc means data=original;<br />var ahbmi ah98 ah99 ah9900 hipcr;<br />OUTPUT OUT=starwars MEAN=avbmi av98 av99 STD=stbmi stah98 stah99 N=nbmi n98 n99 ;<br />run;<br /></span><pre style="font-family:courier new;"><span style=";font-family:courier new;font-size:85%;" >/*creates means ,standard deviation and no of obs and puts them in dataset called starwars which has only one observation*/<br /><br /><br /></span><span style="font-size:85%;">data _null_;<br />set starwars;<br />call symput("bmibar",avbmi); /*creates macro var bmibar that has the value of avbmi*/<br 
/>call symput("a98bar",av98);<br />call symput("a99bar",av99);<br />call symput("s98",stah98);<br />call symput("s99",stah99);<br />call symput("sbmi",stbmi);<br />run;<br /><br />%put mean of bmi is &bmibar;<br />%put mean of ah98 is &a98bar;<br />%put mean of ah99 is &a99bar;<br /><br />data original;<br />set original;<br />centerbmi=ahbmi-&bmibar;</span><span style="font-size:85%;"> /*centers bmi*/<br />bmisd=ahbmi/&sbmi;</span><span style="font-size:85%;"> /*creates variable to do regression with each unit increment of standard deviation */<br />run;<br /></span><span style="font-size:85%;"><br /><span style="font-family:courier new;">/**************Alternate way***************************/<br /></span></span><span style="font-size:85%;"><span style="font-family:courier new;">/**************Standardized Coefficients***************************/</span></span><br /><span style="font-size:85%;"><br />proc reg;<br />model dependent= independent1 independent2 independent3/stb;<br />run;<br /><br />This gives standardized estimates, i.e. the estimates obtained when all<br />variables in the model (including the dependent variable) are standardized<br />to zero mean and unit variance. Each coefficient indicates the number<br />of SDs the dependent variable changes with a one-SD change in the<br />independent variable, holding all other variables constant.<br />This is useful to compare the relative importance of independent<br />variables independent of the scales.</span><br /></pre>Unknownnoreply@blogger.comtag:blogger.com,1999:blog-5055696089992083602.post-66995807903930388282007-03-14T11:14:00.000-07:002007-03-19T19:48:14.200-07:00Productivity tools for sasComparison of two SAS programs: Sometimes similar looking files lead to different results. 
Following programs help finding the difference.<br /><br /><a href="http://www.scootersoftware.com/download.php">Beyond Compare</a><br /><a href="http://www.download.com/3001-2248_4-10619470.html">ExamDiffPro</a><br /><br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.prestosoft.com/images/screenshots/edpro_screen_text.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px;" src="http://www.prestosoft.com/images/screenshots/edpro_screen_text.jpg" alt="" border="0" /></a><br /><br />Extracting columns of results from output e.g.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4PUwiyFzCmP_pVvk4RWGxzEUhqxOHhoELU3kkrtBO1rE20j-LvHOjGFuvefLrRdv_Au101eXm-35g6MvhkTsUa-6IIX9L6gtXw_QR7-1ynMPklDymMatd8lPoCJKoIPVgNDu0PUa3URC9/s1600-h/table1.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4PUwiyFzCmP_pVvk4RWGxzEUhqxOHhoELU3kkrtBO1rE20j-LvHOjGFuvefLrRdv_Au101eXm-35g6MvhkTsUa-6IIX9L6gtXw_QR7-1ynMPklDymMatd8lPoCJKoIPVgNDu0PUa3URC9/s320/table1.png" alt="" id="BLOGGER_PHOTO_ID_5041865835759645842" border="0" /></a><br /><br />without painful editing as<br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjNHRPtVMN0scHEyeb9-HxD1upmKD9_MtMObzBgjSqz8GPQlcpGJQVXTMJdPj3x3_uw3-fdYCibmlyh4NQjkzAb5zlMxp02VDVZVkAJcKNbfozr43w_9sLBsGUbvk8DEW4V7yb3yIRUhFfE/s1600-h/table2.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjNHRPtVMN0scHEyeb9-HxD1upmKD9_MtMObzBgjSqz8GPQlcpGJQVXTMJdPj3x3_uw3-fdYCibmlyh4NQjkzAb5zlMxp02VDVZVkAJcKNbfozr43w_9sLBsGUbvk8DEW4V7yb3yIRUhFfE/s200/table2.png" alt="" 
id="BLOGGER_PHOTO_ID_5041866269551342770" border="0" /></a><br /><a href="http://www.textpad.com/download/index.html">TextPad</a><br /><a href="http://www.xemacs.org/Download/win32/">xemacs for windows</a>Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-5055696089992083602.post-249378076459337872007-03-08T10:39:00.000-08:002007-03-08T10:52:45.689-08:00Prevent code from executingMy sas programs have a lot of code which I need intermittently but I don't need it to run every time the program executes. Since I need the code from time to time, I am reluctant to delete it.<br /><br />There are various ways to prevent code from executing:<br /><ol><li>/*comment out using slash asterisk*/ pieces of code enclosed between slash asterisk and asterisk slash are treated as comments by sas and ignored.</li><li>*comment using asterisk; pieces of code enclosed by asterisk and ';' are treated as comments and ignored by sas. <br /></li><li>%macro junk; enclose code as a macro; %mend; pieces of code enclosed in a macro definition are ignored as long as the macro is never called. This is extremely useful since method 1 and 2 only work if there are no other comments in between. Junk can be replaced with any valid macro name.<br /></li></ol>e.g.<br /><br />/* this is sample code */<br />proc means;<br />var age;<br />run;<br /><br />/* this is second proc */<br />proc univariate;<br />var age;<br />run;<br /><br /><br />If this code needs to be ignored as a whole, method 1 would generate an error, because the comment would end at the first */, and method 2 would be tedious. This kind of code can be prevented from executing as follows:<br /><br />%macro abcdg;<br />/* this is sample code */<br />proc means;<br />var age;<br />run;<br /><br />/* this is second proc */<br />proc univariate;<br />var age;<br />run;<br /><br />%mend;Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-5055696089992083602.post-87117490768580743202007-03-05T06:35:00.001-08:002007-03-05T06:41:20.446-08:00Find missing informationMissing information bugs me! 
Missing missing information bugs me more!!<br /><br />However, there is an easy way to find variables with missing information. This method can be used for both continuous and categorical variables, as long as they are stored as numeric. Try the following after replacing the keywords in <span style="color: rgb(204, 102, 204);">color</span> with your data-specific names.<br /><pre><strong>proc means data=<span style="color: rgb(204, 51, 204);"><span style="color: rgb(204, 102, 204);">trott1</span> </span>NMISS </strong><strong><strong>N </strong></strong><strong>;<br />var <span style="color: rgb(204, 102, 204);">sbp dbp age smoking alcohol</span>;<br />run;</strong><br /></pre>More detailed information is available <a href="http://www.ats.ucla.edu/stat/SAS/faq/nummiss_sas.htm">here</a>.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5055696089992083602.post-35066469165287211442007-01-07T04:32:00.000-08:002007-01-07T09:11:22.239-08:00Using Proc Genmod for logistic, poisson and log binomial regression<div style="text-align: justify;">PROC GENMOD is a procedure for fitting generalized linear models. This procedure is flexible and offers various advantages.<br /><br />Indicator variables do not have to be constructed in advance because it uses a class statement for specifying categorical (classification) variables.<br />Interactions can be specified using an asterisk, for example, batch*gender. <br />In some procedures, variables necessarily have to be numerical. However, in proc genmod, the variables (both outcome and explanatory) can be character.<br />Proc genmod can report likelihood ratio statistics for each variable in the model (TYPE1 and TYPE3 options).<br />Because of the generalized nature, different models can be fitted with one procedure.<br /></div><span style="font-size:100%;"><br /></span><pre><span style=";font-family:courier new;font-size:100%;" >data file11;<br />/* generate data using random numbers. 
details <a href="http://www2.sas.com/proceedings/sugi22/CODERS/PAPER74.PDF">here</a>*/<br />DO MINUTE=0 TO 1000 BY 1;<br />X=UNIFORM (0);<br />X1=UNIFORM (15452);<br />X2=UNIFORM (29561);<br />OUTPUT ;<br />END;<br />run;<br />data filea;<br />set file11;<br />if x>0.5 then gender=1;<br />if x<=0.5 then gender=0;<br />if x1>0.8 then emmig=1;<br />if x1<=0.8 then emmig=0;<br />if x1>0.8 then emmig1='yes';<br />if x1<=0.8 then emmig1='no';<br />if x2>0.4 then cat=1;<br />if x2<=0.4 then cat=0;<br />id=_n_;<br />/* following creates a <a href="http://www.ats.ucla.edu/STAT/SAS/modules/collapse.htm">compressed/collapsed </a>dataset (fam8)<br />with the same information as original dataset*/<br />if cat=0 then do;<br />if gender=1 and emmig=1 then index=1;<br />if gender=0 and emmig=1 then index=2;<br />if gender=1 and emmig=0 then index=3;<br />if gender=0 and emmig=0 then index=4;<br />end;<br />else if cat=1 then do;<br />if gender=1 and emmig=1 then index=5;<br />if gender=0 and emmig=1 then index=6;<br />if gender=1 and emmig=0 then index=7;<br />if gender=0 and emmig=0 then index=8;<br />end;<br />run;<br />PROC MEANS DATA=filea NWAY NOPRINT ;<br />CLASS index gender emmig cat emmig1;<br />VAR index ;<br />OUTPUT OUT=fam8 SUM=number;<br />RUN;<br /><br />data file;<br />set fam8;<br />drop _type_ number;<br />numb=_freq_;<br />id=_n_;<br />run;<br /><br />/* Calculate odds ratio using logistic regression */<br /></span><span style="font-size:100%;"><span style="font-family:courier new;">proc genmod data=file descending ;<br />class cat ;<br />freq numb; /* method to analyze aggregate data */<br />model emmig1 = cat/ dist=binomial link=logit ;<br />estimate 'Beta' cat 1 -1/ exp;<br />run;<br /><br /></span></span><span style=";font-family:courier new;font-size:100%;" >/* Calculate risk ratio using log binomial regression */<br /></span><span style=";font-family:courier new;font-size:100%;" >proc genmod data=filea descending ;<br />class cat 
;<br />model emmig = cat/ dist=binomial link=log ;<br />estimate 'Beta' cat 1 -1/ exp;<br />run;<br /><br /></span><span style=";font-family:courier new;font-size:100%;" >/* Calculate risk ratio using<span style="font-family:monospace;"> </span></span><span style="font-size:100%;">Poisson Regression with Robust Error Variance</span><span style=";font-family:courier new;font-size:100%;" >*/<br /></span><span style="font-family:courier new;font-size:100%;">proc genmod data=filea ;<br />class cat id;<br />model emmig = cat/ dist=poisson link=log ;<br />repeated subject = id/ type = unstr;<br />estimate 'Beta' cat 1 -1/ exp;<br />run;<br /><br />/* the logic of using risk ratio <span style="font-style: italic;">vs </span>odds ratio, and details of log-binomial and<br />poisson regression, are <a href="http://www.ats.ucla.edu/STAT/SAS/faq/relative_risk.htm">here</a>*/<br /></span><br /></pre>Unknownnoreply@blogger.com34tag:blogger.com,1999:blog-5055696089992083602.post-91922609749306169692006-12-30T08:57:00.000-08:002006-12-30T09:06:23.693-08:00Finding the number of times a threshold has exceeded cutoffI was approached by a student who had time series data (120 individual animals observed at 250 time points). She wanted to find the following:<br /><br />1) How many times has an animal's outcome been above a threshold for 5, 6, 7 ... j consecutive time points?<br /><br />2) An animal can be above a threshold for 'j' consecutive time points, then go below it for 'n' consecutive time points, and then go above it again for 'k' consecutive time points. 
I would prefer this pattern to be counted as a distinct pattern, and only once.<br /><br /><span style="font-size:85%;"><br /><span style="font-family: courier new;">/*this solution was offered through SAS-L */</span><br /><span style="font-family: courier new;">/*Generate test data */</span><br /><span style="font-family: courier new;"> data test;</span><br /><span style="font-family: courier new;"> do animal = 1 to 120;</span><span style="font-family: courier new;"><br /> do time = 1 to 250;</span><span style="font-family: courier new;"><br /> outcome = floor(10*ranuni(123) );</span><span style="font-family: courier new;"><br /> output;</span><span style="font-family: courier new;"><br /> end;</span><span style="font-family: courier new;"><br /> end;</span><span style="font-family: courier new;"><br /> run;</span><span style="font-family: courier new;"><br /><br />/*First step is to create a variable indicating whether the threshold (3, for example) is exceeded*/</span><br /><span style="font-family: courier new;"><br /> data step1 /* / view=step1 */;</span><span style="font-family: courier new;"><br /> set test;</span><span style="font-family: courier new;"><br /> over = (outcome > 3);</span><span style="font-family: courier new;"><br /> run;<br /></span><span style="font-family: courier new;"><br />/* Next, reduce to one observation for each run of consecutive values; consecutive ends up as minus the run length */</span><br /><span style="font-family: courier new;"><br /> data step2(drop = outcome) /* /view=step2 */;</span><span style="font-family: courier new;"><br /> do consecutive = -1 by -1 until (last.over);</span><span style="font-family: courier new;"><br /> set step1;</span><span style="font-family: courier new;"><br /> by animal over notsorted;</span><span style="font-family: courier new;"><br /> end;</span><span style="font-family: courier new;"><br /> run;</span><br /><br /><span style="font-family: courier new;"> proc freq data=step2;</span><span style="font-family: courier new;"><br /> tables 
consecutive / nopercent;</span><span style="font-family: courier new;"><br /> where over;</span><span style="font-family: courier new;"><br /> run;</span><br /></span>Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-5055696089992083602.post-22341481941795036692006-12-30T08:29:00.000-08:002006-12-30T08:57:36.437-08:00Finding the largest observationThere might be many ways to do this. A <span style="font-style: italic;">simple way </span>is as follows:<br /><br /><span style="font-size:85%;"><span style="font-family:courier new;">data new;</span><br /><span style="font-family:courier new;">Infile '/udd/n2man/epimarks/classmarks.txt';</span><br /></span><span style="font-size:85%;"><span style="font-family: courier new;"> input roll 4. ExamDate MMDDYY10. @17 MidFinal $2. Marks 4.;</span><br /></span><span style="font-size:85%;"><span style="font-family:courier new;">obsno=_n_;<br />/* creates a new variable obsno whose value is the same as the observation number in the dataset */</span><br /><br /><span style="font-family:courier new;">proc univariate;</span><br /><span style="font-family:courier new;">var marks;</span><br /><span style="font-family:courier new;">run;</span><br /><br /><span style="font-family:courier new;">/* see the observation number for the largest value(s) in the output. Run the program again with the following added */</span><br /><br /><span style="font-family:courier new;">proc print;</span><br /><span style="font-family:courier new;">where obsno=15; /* 15 is the obsno from the output of proc univariate */</span><br /><br /></span><span style="font-style: italic;">A sophisticated way</span><br /><span style="font-size:85%;"><br /><span style="font-family: courier new;">data new2;</span><br /></span><span style="font-family: courier new;font-size:85%;" >Infile '/udd/n2man/epimarks/classmarks.txt';<br /></span><span style="font-size:85%;"><span style="font-family: courier new;"> input roll 4. 
ExamDate MMDDYY10. @17 MidFinal $2. Marks 4.;</span><br /><br /><span style="font-family: courier new;">proc sort data =new2;</span><br /><span style="font-family: courier new;"> by descending marks;</span><br /><br /><span style="font-family: courier new;">data _null_ ; /* _null_ runs the data step without creating an output dataset */</span><br /><span style="font-family: courier new;"> set new2 ;</span><br /><span style="font-family: courier new;"> If _n_=1 then call symputx("IDNumber",Roll);</span><br /><span style="font-family: courier new;">/* this creates a macro variable IDNumber whose value is the roll of the first observation, which is the one with the largest marks because of the sorting; symputx strips the blanks that symput would leave around a numeric value */</span><br /><span style="font-family: courier new;"> else stop;</span><br /><br /><span style="font-family: courier new;">proc print data=new2;</span><br /><span style="font-family: courier new;"> where Roll=&IDNumber; /* Roll is numeric, so the macro variable is not quoted */</span><br /><span style="font-family: courier new;"> format ExamDate WORDDATE18.;</span><br /><span style="font-family: courier new;"> title "Student &IDNumber had the highest marks";</span><br /><span style="font-family: courier new;">run;</span></span>Unknownnoreply@blogger.com5tag:blogger.com,1999:blog-5055696089992083602.post-17376732348837759702006-12-30T08:25:00.000-08:002006-12-30T08:29:09.945-08:00Redirecting sas log and output filesWhen the SAS program mal.sas is run on Unix (as sas mal or sas mal.sas), it produces the files mal.log and mal.lst by default. It is possible to redirect these files as follows:<br /><br /> <span style="font-size:85%;"><span style="font-family: courier new;"> sas mal.sas -log mallog1 -print mresults &</span></span><br /><br />where mallog1 and mresults could be any valid Unix file names. 
The ampersand (&) ensures that the SAS program is run in the background (see earlier posts).Unknownnoreply@blogger.com33tag:blogger.com,1999:blog-5055696089992083602.post-40008475871343854502006-12-25T21:36:00.000-08:002006-12-25T21:40:20.527-08:00Running SAS jobs in background<p>If you want to continue editing programs while SAS jobs run in the background, you can do that by placing an ampersand ("&") after the program name on the command line. For example: </p><pre> sas myprog &<br /></pre>Unknownnoreply@blogger.com0
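The background-job idea above can be sketched in plain shell. This is only an illustration under stated assumptions, not part of the original post: `sleep 1` stands in for `sas myprog` so that the sketch runs on a machine without SAS, and the variable names `job_pid` and `job_status` are made up for the example.

```shell
#!/bin/sh
# Hypothetical stand-in: replace `sleep 1` with your own command, e.g. sas myprog
sleep 1 &
job_pid=$!          # $! holds the process ID of the most recent background job
echo "job $job_pid is running in the background; the prompt is free"

# ... edit other programs here while the job runs ...

wait "$job_pid"     # optional: block until the background job has finished
job_status=$?       # exit status of the job (0 means it ended normally)
echo "job finished with exit status $job_status"
```

If you expect to log out before the job ends, starting it with `nohup` (for example `nohup sas myprog &`) is the usual way to keep it running; whether that is needed depends on how your shell is set up.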