How To Produce Almost Perfect RTF Output Ins & Outs NESUG 2006
Suzanne M. Dorinski, US Census Bureau, Washington, DC

ABSTRACT
The Census Bureau performs an annual data collection for the National Center for Education Statistics. As part of the file documentation, the Census Bureau produces output that shows the imputation flag distribution for more than 140 variables. Census also produces a summary table that shows the minimum, maximum, and mean value for each continuous variable, along with the number of records that have a value greater than or equal to zero, the number of records with missing values, and the number of records with “not applicable” values. The output can be easily produced as RTF (Rich Text Format), but we need to use many little tricks to produce beautiful Microsoft Word output. The little tricks include modifying the PRINTER style, modifying the one way frequency table template, using ODS RTF TEXT= statements with RTF control words, using inline formatting on the footnote statement, and applying style definitions to PROC PRINT to produce RTF output that almost exactly meets the publication style. Setting up the SAS® code only needs to be done one time, thus eliminating a lot of manual formatting in Microsoft Word.

INTRODUCTION
SAS® ODS RTF output has been available since version 8.1. With each new release of SAS, there are more ways to automatically customize the RTF output within SAS, which eliminates more and more of the manual formatting in Microsoft Word. The little tricks to produce beautiful Microsoft Word output are especially helpful for reports that are generated repeatedly over time. This paper brings together some of the little tricks all in one place, using SAS 9.1.3.

THE DATA USED IN THIS PAPER
The Census Bureau collects revenue and expenditure data from each state education agency for the National Public Education Financial Survey (NPEFS) each year.
The National Center for Education Statistics (NCES) is the agency that sponsors the data collection. The revenue and expenditure data cover pre-kindergarten through grade 12. Each year, NCES publishes a report based on the data and makes the dataset available to the public.

This paper uses the Excel file for the Fiscal Year (FY) 2002 NPEFS, which is available at http://www.nces.ed.gov/ccd/stfis.asp. NCES also publishes documentation for each file. The FY 2002 NPEFS file documentation is available at http://nces.ed.gov/ccd/pdf/stfis02gen1c.pdf. The report based on the data is “Revenues and Expenditures for Public Elementary and Secondary Education: School Year 2001-02”, which is available at http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2004341.

The data are edited, and missing data are imputed. Appendix G of the file documentation has two tables. Table G-1 shows the frequencies of the imputation flags for each item on the survey, while Table G-2 shows the minimum, maximum, and mean value for each item on the survey, along with a count of the number of missing values and the number of “not applicable” values.

We can use PROC FREQ to produce the frequencies of the imputation flags for each item on the survey. In the Excel spreadsheet, if the item is missing, it has the value -1, and if the item is not applicable, it has the value -2. We can use a macro to recode the negative values to special missing values and set dummy indicator variables. Then PROC MEANS can calculate the minimum, maximum, and mean value for each item on the survey. Another PROC MEANS can count the number of missing or not applicable values for each item on the survey. The relevant information is stored in a data set, and PROC PRINT produces the report.

HOW TO FORMAT THE OUTPUT IN TABLE G-1
The first page of Table G-1 from the NPEFS file documentation is shown on the next page. There are several titles, then a section of text that defines the imputation flag values, followed by the PROC FREQ output.
The variable labels above each table in the PROC FREQ output are in bold.

BOLD THE VARIABLE LABELS ABOVE EACH PROC FREQ TABLE
We can use the PRINTER style to format the output, but the PRINTER style defines the heading font as bold, which will make the second row of each table (variable, Frequency, Percent, Cumulative Frequency, and Cumulative Percent) bold also. To make only the variable label appear in bold, we modify the PRINTER style to take the bold out of the heading font. To make the variable label bold, we modify the Base.Freq.OneWayFreqs template to apply the bold font to h1.

[Figure: the first page of Table G-1 from the NPEFS file documentation.]

SPECIFY ONLY ONE DECIMAL PLACE FOR THE PERCENTAGES
The flag frequency tables show the percentage and cumulative percentage with one decimal place. PROC FREQ shows two decimals by default. We can apply number formats to the variables in the modified Base.Freq.OneWayFreqs template.

MAKE THE PROC FREQ TABLES HAVE UNIFORM WIDTH
Each column in the PROC FREQ tables will be as wide as the widest cell in that column. The column showing the survey item flag will vary in width from table to table because the item flag names are different lengths, and we are using a proportional font. The shortest item flag name has only 3 characters, while the longest item flag name has 8 characters. To make the survey item flag column the same width across tables, we apply the cellwidth style attribute to FVariable in the modified Base.Freq.OneWayFreqs template.

SHOW LINE ONLY UNDER ROW 2 OF PROC FREQ OUTPUT
The table style element in the PRINTER style template controls what kind of lines are in the PROC FREQ tables. Specify rules=groups and frame=void to get only a line under the second row of each table.

THE MODIFIED BASE.FREQ.ONEWAYFREQS TEMPLATE
The modified template for the PROC FREQ output is shown below.
    ODS PATH WORK.TEMPLAT(UPDATE) SASHELP.Tmplmst(READ);

    proc template;
      edit Base.Freq.OneWayFreqs;
        edit h1;
          style={font_weight=bold}; * make label above each table appear in bold font ;
        end;
        edit FVariable;
          style={cellwidth=1in}; * make all variable columns have same width in output ;
        end;
        edit Frequency;
          format=10.;
        end;
        edit Percent;
          format=10.1; * want only one decimal place displayed ;
        end;
        edit CumPercent;
          format=10.1;
        end;
        edit CumFrequency;
          format=10.;
        end;
      end;
    run;

THE MODIFIED PRINTER STYLE TEMPLATE
The modified printer style template is shown below.

    proc template;
      define style styles.newprinter;
        parent=styles.printer;
        replace color_list "Colors used in the default style" /
          'link' = blue
          'bgH'  = white /* default is graybb */
          'fg'   = black
          'bg'   = white;
        replace fonts /
          'TitleFont2'          = ("Times Roman",12pt,Bold Italic)
          'TitleFont'           = ("Times Roman",12pt,Bold) /* default is 13 pt bold italic */
          'StrongFont'          = ("Times Roman",10pt,Bold)
          'EmphasisFont'        = ("Times Roman",10pt,Italic)
          'FixedEmphasisFont'   = ("Courier",9pt,Italic)
          'FixedStrongFont'     = ("Courier",9pt,Bold)
          'FixedHeadingFont'    = ("Courier",9pt,Bold)
          'BatchFixedFont'      = ("SAS Monospace, Courier",6.7pt)
          'FixedFont'           = ("Courier",9pt)
          'headingEmphasisFont' = ("Times Roman",12pt,Bold Italic) /* default is 11 pt */
          'headingFont'         = ("Times Roman",10pt) /* default is 11 pt bold */
          'docFont'             = ("Times Roman",10pt);
        style body from document /
          leftmargin=1in
          rightmargin=1in
          topmargin=0.6in /* guess on top and bottom margin from NCES PDF */
          bottommargin=0.4in;
        style rowheader from rowheader /
          background=_undef_
          font=fonts('docFont');
        style table from table /
          rules=groups /* only line in each table is below header */
          frame=void
          cellpadding=1pt /* minimize space inside cells */
          outputwidth=100%; /* outputwidth=100% forces all tables to use entire width of page */
        style UserText from UserText "Controls the TEXT= style" /
          outputwidth=100% /* forces all ODS RTF TEXT= boxes to use entire width of page */
          protectspecialchars=off;
        style systemtitle from systemtitle /
          protectspecialchars=off; /* allow me to insert RTF control words */
      end;
    run;

HOW TO PRODUCE THE EM DASH AND EN DASH IN TITLES
The dash between the words “G.” and “Value” is an em dash, while the dash between “2001” and “02” is an en dash. To produce these special dashes, we need to use inline formatting in the title statements. The em dash is produced using the \emdash control word, while the en dash is produced using the \endash control word. The title statements look like this:

    title "Appendix G.\emdash Value Distribution and Field Frequencies";
    title2 "\b0 Revised File";
    title3 "\line \ql ^S={font_size=10pt}Table G-1. Frequencies of imputation flags, state finance survey: 2001\endash 02\emdash Continued";

We use the word Continued in title3 because pages 2 through the end of the output need to show that word in the title.

OTHER SPECIAL RTF CONTROL WORDS
The \line shown in title3 above is the RTF control word that moves the text to the next line. \ql causes the text to be left-justified. The titles will be in Times Roman 12 pt bold because of the TitleFont specification in the modified printer style template. If we need to turn off the bold on one of the title lines, we can use \b0 to do that, as shown in the title2 statement above.

HOW TO GET THE IMPUTATION FLAG DEFINITIONS INTO THE OUTPUT
We use ODS RTF TEXT= to put the phrase Imputation Flags into the output. The \qc control word centers the text, the \b control word puts the text in bold, while the \ul control word underlines the text, as shown below.

    ODS RTF TEXT="\qc \b \ul Imputation Flags";

We use a data step to set up the imputation flag definitions, and then apply style elements to the PROC PRINT to format the text.
    data flags;
      length letter $ 1 sign $ 1 descrip $ 54;
      input letter sign descrip $54.;
      cards;
    R = As reported by the state
    A = Adjustment
    I = Imputed based on a method other than prior year's data
    T = Total based on sum of internal or external detail
    C = Combined with data provided elsewhere by the state
    ;
    run;

    * need to have footnotes on the proc print to get footnotes to show on first page! ;
    proc print data=flags noobs
      style(report)={rules=none}        /* no lines in the table */
      style(header)={foreground=white   /* foreground=white means labels won't be visible */
                     cellheight=0.1pt}; /* cellheight making header row very tiny */
      var letter sign / style(data)={just=c};
      var descrip;
      footnote1 "^S={just=l font_weight=medium font_size=10pt}See notes at end of table.";
      footnote3 "^S={just=c font_weight=medium font_face=arial font_size=8.5pt}{G}^{thispage}";
    run;

    ODS RTF sectiondata="\sbknone"; * want proc freq to start in same section as flag definitions ;

HOW TO FORMAT THE FOOTNOTES
The footnotes are 12pt Times Roman by default, due to the TitleFont specification in the modified Printer style template. The ^S={ } are the styles applied in the footnote statement, which override the TitleFont specification. The ^ was defined as the ODS escape character, so SAS knows that a style is being applied. Footnote1 will be Times Roman 10 pt, not bolded, and left justified. Footnote3 will be Arial 8.5pt, not bolded, and centered. The footnotes are shown in the imputation flag definitions PROC PRINT so that they will show up on the first page of the output, and each succeeding page.

THE PROC FREQ OUTPUT
Assuming that the data set contains only the imputation flags, the PROC FREQ output is easy to produce with the text shown below.

    ODS RTF TEXT="\line";
    proc freq data=doc_output;
      tables _all_;
    run;

The ODS RTF TEXT= statement inserts a blank line between the imputation flag definitions and the first table of the PROC FREQ output.
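The pieces described above (an edited one-way frequency template, an inline-formatted footnote, and a PROC FREQ step inside ODS RTF) can be tried on a small scale before running them against the NPEFS flags. The sketch below is not from the original program. It assumes the SASHELP.CLASS sample data set, the stock PRINTER style rather than the modified one, and a hypothetical output file name, demo_freq.rtf. It avoids raw RTF control words because the stock style does not turn off protectspecialchars.

```sas
/* Small-scale trial, not part of the original program.            */
/* Assumptions: SASHELP.CLASS is available; demo_freq.rtf is a     */
/* hypothetical file name written to the current directory.        */
ods escapechar='^';
ods listing close;
ods path work.templat(update) sashelp.tmplmst(read);

proc template;
  edit Base.Freq.OneWayFreqs;
    edit Percent;
      format=10.1;   /* one decimal place */
    end;
    edit CumPercent;
      format=10.1;
    end;
  end;
run;

ods rtf file='demo_freq.rtf' style=printer;
ods noproctitle;

title "Demo: one-way frequencies with one decimal place";
footnote1 "^S={just=l font_size=10pt}See notes at end of table.";

proc freq data=sashelp.class;
  tables sex age;
run;

ods rtf close;

/* remove the edited template so later PROC FREQ runs use the default */
proc template;
  delete Base.Freq.OneWayFreqs;
run;

title;
footnote;
ods listing;
```

Opening demo_freq.rtf in Word should make it easy to check the percentage columns and the footnote before adding the style template and the RTF control words.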
All the formatting we needed to do is taken care of in the modified Base.Freq.OneWayFreqs template and the modified PRINTER style template.

NOTES FOR THE LAST PAGE OF TABLE G-1
The last page of Table G-1 needs notes at the end of the table to show the source of the data. We can insert these notes with several ODS RTF TEXT= statements, as shown below.

    ODS RTF TEXT="\line \line";
    ODS RTF TEXT="\ql ^S={font_size=11pt}Source: Data reported by states to U.S. Department of Education, National Center for Education";
    ODS RTF TEXT="\ql ^S={font_size=11pt}Statistics, Common Core of Data (CCD), National Public Education Finance Survey";
    ODS RTF TEXT="\ql ^S={font_size=11pt}(NPEFS) FY 2002, (stfis021c).";

ANYTHING WE STILL NEED TO MANUALLY EDIT IN WORD FOR TABLE G-1?
Unfortunately, yes. The Table G-1 title should not say “Continued” on the very first page, and “See notes at end of table” should not appear on the very last page of Table G-1. The user can delete the “Continued” in the header on the first page, and the “See notes at end of table” in the footer on the last page. Each PROC FREQ table is in a separate section, so the headers and footers in the RTF are independent of one another.

[Figure: the first page of Table G-1 from SAS 9.1.3, before deleting the word “Continued”.]

HOW TO FORMAT THE OUTPUT IN TABLE G-2
The first page of Table G-2 from the NPEFS file documentation is shown on the next page. The relevant data is calculated by a macro and stored in a dataset. PROC PRINT produces the report.

We can use the same modified PRINTER style template, with some slight changes. The output for Table G-2 is wider, so we set the left and right margins to 0.5 inches. The code below shows the style applied to the PROC PRINT to bold the header of the table. The title statements, footnote statements, and ODS RTF TEXT= statements are similar to those discussed for Table G-1.
The ^{thispage} will show 1 on the first page of this output, but when this output is inserted after the Table G-1 output, the page number will be continued from the page numbering of Table G-1. When both tables are added to the body of another document, the user will have to manually reset the page number to start at 1 for the first page of Table G-1. I still haven’t figured out how to set this properly in SAS. If anybody knows how to do it, please let me know!

If you do the PROC PRINT on all 143 variables, the output is in just one section, which means that the user won’t be able to easily change the header on the first page and the footer on the last page. One admittedly clunky way to sort of fix this is to do the PROC PRINT on the number of observations that will fit on a page. I was getting 47 observations per page, so the first PROC PRINT is for firstobs=1 to obs=47, the second PROC PRINT is for firstobs=48 to obs=94, the third PROC PRINT is for firstobs=95 to obs=141, and the last PROC PRINT is for firstobs=142 to obs=143. The last PROC PRINT has style(report)={frame=hsides} to get the lines above and below the table.

    ODS ESCAPECHAR='^';
    options center;

    proc print data=min_max_summary_report(firstobs=1 obs=47) label noobs
      style(header)={font_weight=bold}
      style(report)={frame=above};
      title "Appendix G.\emdash Value Distribution and Field Frequencies";
      title2 "\b0 Revised File";
      title3 "^S={font_size=10pt}\line \ql Table G-2. Minimum, maximum, and mean values for continuous variables, state finance survey: 2001\endash 02 \emdash Continued";
      var field description / style(header)={just=l};
      var count min max mean missing_sum na_sum / style(header)={just=r};
      format min max mean comma20.1;
      label field='Variable'
            description='Label'
            count='N'
            min='Minimum'
            max='Maximum'
            mean='Mean'
            missing_sum='-1'
            na_sum='-2';
      footnote1 "^S={just=l font_weight=medium font_size=9pt}See notes at end of table.";
      footnote3 "^S={just=c font_weight=medium font_face=arial font_size=8.5pt}G-^{thispage}";
    run;

    /* [3 other proc prints need to occur here to get complete report] */

    ODS RTF TEXT="\line";
    ODS RTF TEXT="\ql \ul Note:";
    ODS RTF TEXT="\ql -1 = 'Missing'";
    ODS RTF TEXT="\ql -2 = 'Not Applicable'";
    ODS RTF TEXT="\ql Source: Data reported by states to the U.S. Department of Education, National Center for Education Statistics, Common Core of Data";
    ODS RTF TEXT="\ql (CCD), National Public Education Finance Survey (NPEFS) FY 2002, (stfis021c).";

[Figure: the first page of Table G-2 from the NPEFS file documentation.]

[Figure: the first page of Table G-2 from SAS 9.1.3, before deleting the word “Continued”.]

ANYTHING WE STILL NEED TO MANUALLY EDIT IN WORD FOR TABLE G-2?
Unfortunately, yes, just like Table G-1. The Table G-2 title should not say “Continued” on the very first page, and “See notes at end of table” should not appear on the very last page of Table G-2. The user can delete the “Continued” in the header on the first page, and the “See notes at end of table” in the footer on the last page. Each PROC PRINT table is in a separate section, so the headers and footers in the RTF are independent of one another.

CONCLUSION
By using a variety of techniques, we can produce almost perfect RTF output, which requires very little manual editing in Word. Output that needs to be produced on a regular basis can be automated in SAS.
REFERENCES
Haworth, Lauren. 2006. “PROC TEMPLATE: The Basics”. Proceedings of the Thirty-First Annual SAS Users Group International Conference. San Francisco, CA. Available at http://www2.sas.com/proceedings/sugi31/112-31.pdf.

Haworth, Lauren. 2004. “SAS with Style: Creating your own ODS Style Template for RTF Output”. Proceedings of the Twenty-Ninth Annual SAS Users Group International Conference. Montreal, Quebec. Available at http://www2.sas.com/proceedings/sugi29/125-29.pdf.

Haworth, Lauren E. 2001. Output Delivery System: The Basics. Cary, NC: SAS Institute Inc.

McNeill, Sandy. 2001. “Changes & Enhancements for ODS by Example (through Version 8.2)”. Proceedings of the Twenty-Sixth Annual SAS Users Group Conference. Long Beach, CA. Available at http://www2.sas.com/proceedings/sugi26/p002-26.pdf.

Documentation of the RTF Specification: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnrtfspec/html/rtfspec.asp

ACKNOWLEDGMENTS
Thanks to Freda Spence and Mary Church for giving me the idea for this paper. Thanks to Kevin Smith and Kathryn McLawhorn of SAS Institute for answering questions on how to bold the labels on the PROC FREQ output but have the Frequency label show up in regular font. Thanks to Richard Sigman for showing me how to use call symput. Thanks to Carma Hogue, Rita Petroni, Kathy McDonald-Johnson, and Chris Boniface for their helpful comments on this paper.

DISCLAIMER
This report is released to inform interested parties of research and to encourage discussion. The views expressed on technical issues are those of the author and not necessarily those of the U.S. Census Bureau.

CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:

    Suzanne M. Dorinski
    U.S. Census Bureau
    Economic Statistical Methods and Programming Division
    Washington, DC 20233-6200
    Work Phone: 301-763-4869
    E-mail: [email protected]

SAS and all other SAS Institute Inc.
product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.

APPENDIX A: COMPLETE CODE FOR TABLE G-1 OUTPUT

    dm 'cle log; cle out';
    *************************************************************;
    * this is field_frequencies_in_RTF.sas                      *;
    *                                                           *;
    * Use SAS ODS to put field frequency output into RTF.       *;
    *                                                           *;
    * Suzanne M. Dorinski 6/4/06                                *;
    *                                                           *;
    * This program written for SAS 9.1.3                        *;
    *************************************************************;

    ODS ESCAPECHAR='^'; * escape character for ODS formatting ;

    title; * clear out previous title and footnote statements ;
    footnote;

    ODS LISTING CLOSE; * do not show output in listing window ;

    ODS PATH WORK.TEMPLAT(UPDATE) SASHELP.Tmplmst(READ);

    proc template;
      edit Base.Freq.OneWayFreqs;
        edit h1;
          style={font_weight=bold}; * make label above each table appear in bold font ;
        end;
        edit FVariable;
          style={cellwidth=1in}; * make all variable columns have same width in output ;
        end;
        edit Frequency;
          format=10.;
        end;
        edit Percent;
          format=10.1; * want only one decimal place displayed ;
        end;
        edit CumPercent;
          format=10.1;
        end;
        edit CumFrequency;
          format=10.;
        end;
      end;
    run;

    proc template;
      define style styles.newprinter;
        parent=styles.printer;
        replace color_list "Colors used in the default style" /
          'link' = blue
          'bgH'  = white /* default is graybb */
          'fg'   = black
          'bg'   = white;
        replace fonts /
          'TitleFont2'          = ("Times Roman",12pt,Bold Italic)
          'TitleFont'           = ("Times Roman",12pt,Bold) /* default is 13 pt bold italic */
          'StrongFont'          = ("Times Roman",10pt,Bold)
          'EmphasisFont'        = ("Times Roman",10pt,Italic)
          'FixedEmphasisFont'   = ("Courier",9pt,Italic)
          'FixedStrongFont'     = ("Courier",9pt,Bold)
          'FixedHeadingFont'    = ("Courier",9pt,Bold)
          'BatchFixedFont'      = ("SAS Monospace, Courier",6.7pt)
          'FixedFont'           = ("Courier",9pt)
          'headingEmphasisFont' = ("Times Roman",12pt,Bold Italic) /* default is 11 pt */
          'headingFont'         = ("Times Roman",10pt) /* default is 11 pt bold */
          'docFont'             = ("Times Roman",10pt);
        style body from document /
          leftmargin=1in
          rightmargin=1in
          topmargin=0.6in /* guess on top and bottom margin from NCES PDF */
          bottommargin=0.4in;
        style rowheader from rowheader /
          background=_undef_
          font=fonts('docFont');
        style table from table /
          rules=groups /* only line in each table is below header */
          frame=void
          cellpadding=1pt /* minimize space inside cells */
          outputwidth=100%; /* outputwidth=100% forces all tables to use entire width of page */
        style UserText from UserText "Controls the TEXT= style" /
          outputwidth=100% /* forces all ODS RTF TEXT= boxes to use entire width of page */
          protectspecialchars=off;
        style systemtitle from systemtitle /
          protectspecialchars=off; /* allow me to insert RTF control words */
      end;
    run;

    ODS RTF FILE='C:\Documents and Settings\My Documents\NESUG 2006 paper\field_frequencies.rtf'
            STYLE=newprinter;
    ODS RESULTS OFF; * do not open up Word viewer in SAS session ;

    PROC IMPORT OUT= flag_data_to_be_summarized
         DATAFILE= "C:\Documents and Settings\My Documents\NESUG 2006 paper\stfis021c.xls"
         DBMS=EXCEL REPLACE;
      RANGE='E61:EQ117';
      GETNAMES=YES;
    RUN;

    * note that complete label statement is NOT shown below ;
    data doc_output;
      set flag_data_to_be_summarized;
      label iR1A     = 'iR1A - Local Revenue Property Tax'
            iR1B     = 'iR1B - Local Revenue Non Property Tax'
            iR1C     = 'iR1C - Local Revenue Local Government Property Tax'
            iR1D     = 'iR1D - Local Revenue Local Government Non Property Tax'
            iR1E     = 'iR1E - Local Revenue Individual Tuition'
            iR1F     = 'iR1F - Local Revenue Tuition From Local Education Agency'
            /* labels for the remaining flag variables are not shown */
            iTX12    = 'iTX12 - Total Exclusions'
            iNCE13   = 'iNCE13 - Net Current Expenditures'
            iMEMBR01 = 'iMEMBR01 - Student Membership'
            ;
    run;

    ODS NOPROCTITLE; * suppress "The FREQ Procedure" in the RTF output ;

    * \emdash adds em dash to RTF, while \endash adds en dash to RTF ;
    title "Appendix G.\emdash Value Distribution and Field Frequencies";
    title2 "\b0 Revised File";
    title3 "\line \ql ^S={font_size=10pt}Table G-1. Frequencies of imputation flags, state finance survey: 2001\endash 02\emdash Continued";

    options nodate nonumber CENTER;

    ODS RTF TEXT="\qc \b \ul Imputation Flags";
    /* \qc specifies text is centered   */
    /* \b specifies text is bolded      */
    /* \ul specifies text is underlined */

    data flags;
      length letter $ 1 sign $ 1 descrip $ 54;
      input letter sign descrip $54.;
      cards;
    R = As reported by the state
    A = Adjustment
    I = Imputed based on a method other than prior year's data
    T = Total based on sum of internal or external detail
    C = Combined with data provided elsewhere by the state
    ;
    run;

    * need to have footnotes on the proc print to get footnotes to show on first page! ;
    proc print data=flags noobs
      style(report)={rules=none}        /* no lines in the table */
      style(header)={foreground=white   /* foreground=white means labels won't be visible */
                     cellheight=0.1pt}; /* cellheight makes header row very tiny */
      var letter sign / style(data)={just=c};
      var descrip;
      footnote1 "^S={just=l font_weight=medium font_size=10pt}See notes at end of table.";
      footnote3 "^S={just=c font_weight=medium font_face=arial font_size=8.5pt}{G}^{thispage}";
    run;

    ODS RTF sectiondata="\sbknone"; * want proc freq to start in same section as flag definitions ;

    ODS RTF TEXT="\line";

    proc freq data=doc_output;
      tables _all_;
    run;

    ODS RTF TEXT="\line \line";
    ODS RTF TEXT="\ql ^S={font_size=11pt}Source: Data reported by states to U.S. Department of Education, National Center for Education";
    ODS RTF TEXT="\ql ^S={font_size=11pt}Statistics, Common Core of Data (CCD), National Public Education Finance Survey";
    ODS RTF TEXT="\ql ^S={font_size=11pt}(NPEFS) FY 2002, (stfis021c).";

    ODS RTF CLOSE;

    * now get rid of style template that was created in this program ;
    proc template;
      delete Base.Freq.OneWayFreqs;
      delete styles.newprinter;
    run;

    * now open the listing window again ;
    ODS LISTING;

APPENDIX B: COMPLETE CODE FOR TABLE G-2 OUTPUT

    dm 'cle log; cle out';
    ********************************************************************;
    * this is value_distribution_in_RTF.sas                            *;
    *                                                                  *;
    * Use SAS ODS to put value distribution output into RTF.           *;
    *                                                                  *;
    * Suzanne M. Dorinski 6/4/06                                       *;
    *                                                                  *;
    * program written in SAS 9.1.3                                     *;
    ********************************************************************;

    options mprint nodate nonumber nosymbolgen;

    PROC IMPORT OUT= data_to_be_summarized
         DATAFILE= "C:\Documents and Settings\My Documents\NESUG 2006 paper\stfis021c.xls"
         DBMS=EXCEL REPLACE;
      RANGE='E1:EQ57';
      GETNAMES=YES;
    RUN;

    * want to make sure variables in table are in same order as shown
    * in Excel spreadsheet ;
    proc sql;
      create table variable_list as
        select name, varnum
        from sashelp.vcolumn
        where libname='WORK' and memname='DATA_TO_BE_SUMMARIZED';
    quit;

    proc sort data=variable_list;
      by name;
    run;

    * note that complete descriptive_text data set is NOT shown below ;
    data descriptive_text;
      input name $ description $60.;
      cards;
    R1A LOCAL REV PROPERTY TAX
    R1B LOCAL REV NON PROPERTY TAX
    R1C LOCAL REV LOC GOVT PROP TAX
    R1D LOCAL REV LOC GOVT NON PROP TAX
    R1E LOCAL REV INDIVID TUITION
    R1F LOCAL REV TUITION FR LEA'S
    A14B ADA (NCES DEFINITION)
    PPE15 PER PUPIL EXPENDITURES
    MEMBR01 TOTAL STUDENTS
    ;
    run;

    proc sort data=descriptive_text;
      by name;
    run;

    * add descriptions to variable_list data set;
    data variable_list;
      merge variable_list(in=l) descriptive_text;
      by name;
      if l;
    run;

    proc sort data=variable_list;
      by varnum;
    run;

    /* how many numeric variables are in the report? */
    /* need the count for the do loop in the min_max macro. */
    proc sql noprint;
      select count(*) into :numobs
      from variable_list;
    quit;

    %macro min_max;
      /* min_max macro handles one numeric variable at a time.     */
      /* the macro counts the number of records with nonnegative   */
      /* values, calculates the minimum, maximum, and mean values, */
      /* and also counts the number of records with -1 values or   */
      /* -2 values. the results are appended to the                */
      /* min_max_summary_report data set.                          */
      %do i=1 %to &numobs;

        data _null_;
          /* this data step gets the numeric variable's name and description */
          obsnum=&i;
          set variable_list point=obsnum;
          if _error_ then abort;
          call symputx('field',name);
          call symputx('description',description);
          stop;
        run;

        /* the data step below is recoding the negative values so that  */
        /* the minimum and mean values are handled correctly by PROC    */
        /* MEANS. the data step is also creating two indicator          */
        /* variables which will be used by another PROC MEANS to count  */
        /* the number of records with missing or not applicable values. */
        data &field;
          set data_to_be_summarized(keep=&field);
          if &field=-1 then do;
            &field=.M;
            &field._missing=1;
          end;
          if &field=-2 then do;
            &field=.N;
            &field._na=1;
          end;
        run;

        proc means data=&field n min max mean noprint;
          var &field;
          output out=&field._summary n=count min=min max=max mean=mean;
        run;

        data &field._summary;
          length field $ 8;
          set &field._summary(drop=_type_ _freq_);
          field="&field";
        run;

        proc means data=&field sum noprint;
          var &field._missing &field._na;
          output out=&field._negative sum=missing_sum na_sum;
        run;

        data &field._negative;
          length field $ 8;
          set &field._negative(drop=_type_ _freq_);
          field="&field";
        run;

        /* the data step below is combining the results from the first and second PROC MEANS */
        data &field._report;
          length description $ 40;
          merge &field._summary &field._negative;
          by field;
          description="&description";
          if missing_sum=. then missing_sum=0;
          if na_sum=. then na_sum=0;
        run;

        proc append base=min_max_summary_report data=&field._report;
        run;

        proc datasets nolist;
          delete &field &field._summary &field._negative &field._report;
        quit;

      %end;
    %mend min_max;

    %min_max

    ODS PATH WORK.TEMPLAT(UPDATE) SASHELP.Tmplmst(READ);

    proc template;
      define style styles.newprinter;
        parent=styles.printer;
        replace color_list "Colors used in the default style" /
          'link' = blue
          'bgH'  = white /* bgH was graybb */
          'fg'   = black
          'bg'   = white;
        replace fonts /
          'TitleFont2'          = ("Times Roman",12pt,Bold Italic)
          'TitleFont'           = ("Times Roman",12pt,Bold) /* default is 13pt bold italic */
          'StrongFont'          = ("Times Roman",12pt,Bold) /* default is 10 pt */
          'EmphasisFont'        = ("Times Roman",10pt,Italic)
          'FixedEmphasisFont'   = ("Courier",9pt,Italic)
          'FixedStrongFont'     = ("Courier",9pt,Bold)
          'FixedHeadingFont'    = ("Courier",9pt,Bold)
          'BatchFixedFont'      = ("SAS Monospace, Courier",6.7pt)
          'FixedFont'           = ("Courier",9pt)
          'headingEmphasisFont' = ("Times Roman",12pt,Bold Italic) /* default is 11 pt */
          'headingFont'         = ("Times Roman",10pt) /* default is 11 pt bold */
          'docFont'             = ("Times Roman",9pt); /* default is 10 pt */
        style body from document /
          leftmargin=0.5in
          rightmargin=0.5in
          topmargin=0.6in /* guess top and bottom margin from NCES PDF */
          bottommargin=0.4in;
        style rowheader from rowheader /
          background=_undef_
          font=fonts('docFont');
        style table from table /
          rules=groups /* only line inside table is below header */
          frame=void
          cellpadding=1pt /* minimize space inside cells */
          outputwidth=100%; /* outputwidth=100% forces all tables to use entire width of page */
        style UserText from UserText "Controls the TEXT= style" /
          outputwidth=100% /* force ODS RTF TEXT= cells to use entire width of page */
          protectspecialchars=off;
        style systemtitle from systemtitle /
          protectspecialchars=off; /* allow me to insert RTF control words */
      end;
    run;

    ODS RTF FILE="C:\Documents and Settings\My Documents\NESUG 2006 paper\value_distribution.rtf"
            STYLE=newprinter;
    ODS RESULTS OFF; * do not open Word viewer in SAS session ;

    ODS ESCAPECHAR='^';
    options center;

    proc print data=min_max_summary_report(firstobs=1 obs=47) label noobs
      style(header)={font_weight=bold}
      style(report)={frame=above};
      title "Appendix G.\emdash Value Distribution and Field Frequencies";
      title2 "\b0 Revised File";
      title3 "^S={font_size=10pt}\line \ql Table G-2. Minimum, maximum, and mean values for continuous variables, state finance survey: 2001\endash 02 \emdash Continued";
      var field description / style(header)={just=l};
      var count min max mean missing_sum na_sum / style(header)={just=r};
      format min max mean comma20.1;
      label field='Variable'
            description='Label'
            count='N'
            min='Minimum'
            max='Maximum'
            mean='Mean'
            missing_sum='-1'
            na_sum='-2';
      footnote1 "^S={just=l font_weight=medium font_size=9pt}See notes at end of table.";
      footnote3 "^S={just=c font_weight=medium font_face=arial font_size=8.5pt}G-^{thispage}";
    run;

    proc print data=min_max_summary_report(firstobs=48 obs=94) label noobs
      style(header)={font_weight=bold}
      style(report)={frame=above}; /* line above table only */
      var field description / style(header)={just=l};
      var count min max mean missing_sum na_sum / style(header)={just=r};
      format min max mean comma20.1;
      label field='Variable'
            description='Label'
            count='N'
            min='Minimum'
            max='Maximum'
            mean='Mean'
            missing_sum='-1'
            na_sum='-2';
    run;

    proc print data=min_max_summary_report(firstobs=95 obs=141) label noobs
      style(header)={font_weight=bold}
      style(report)={frame=above};
      var field description / style(header)={just=l};
      var count min max mean missing_sum na_sum / style(header)={just=r};
      format min max mean comma20.1;
      label field='Variable'
            description='Label'
            count='N'
            min='Minimum'
            max='Maximum'
            mean='Mean'
            missing_sum='-1'
            na_sum='-2';
    run;

    proc print data=min_max_summary_report(firstobs=142 obs=143) label noobs
      style(header)={font_weight=bold}
      style(report)={frame=hsides}; /* line above and below table */
      var field description / style(header)={just=l};
      var count min max mean missing_sum na_sum / style(header)={just=r};
      format min max mean comma20.1;
      label field='Variable'
            description='Label'
            count='N'
            min='Minimum'
            max='Maximum'
            mean='Mean'
            missing_sum='-1'
            na_sum='-2';
    run;

    ODS RTF TEXT="\line";
    ODS RTF TEXT="\ql \ul Note:";
    ODS RTF TEXT="\ql -1 = 'Missing'";
    ODS RTF TEXT="\ql -2 = 'Not Applicable'";
    ODS RTF TEXT="\ql Source: Data reported by states to the U.S. Department of Education, National Center for Education Statistics, Common Core of Data";
    ODS RTF TEXT="\ql (CCD), National Public Education Finance Survey (NPEFS) FY 2002, (stfis021c).";

    ODS RTF CLOSE;

    proc datasets nolist;
      delete min_max_summary_report;
    run;

    proc template;
      delete styles.newprinter;
    run;
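The four PROC PRINT steps in Appendix B differ only in their FIRSTOBS= and OBS= values and in the frame setting on the last step. As an alternative that is not part of the original program, a macro loop could generate one PROC PRINT per page-sized chunk. The sketch below is a minimal illustration under stated assumptions: the macro name print_chunks and its parameters (data=, total=, per_page=) are hypothetical, the defaults of 143 rows and 47 rows per page are the counts reported in the paper, and the TITLE and FOOTNOTE statements are assumed to have been issued before the macro is called, with ODS RTF already open.

```sas
/* Hypothetical helper, not part of the original program.         */
/* Assumes titles, footnotes, and ODS RTF are already set up.     */
%macro print_chunks(data=min_max_summary_report, total=143, per_page=47);
  %local first last frame;
  %let first=1;

  %do %while (&first <= &total);
    /* last row of this chunk, capped at the total row count */
    %let last=%eval(&first + &per_page - 1);
    %if &last > &total %then %let last=&total;

    /* final chunk gets lines above and below; others only above */
    %if &last = &total %then %let frame=hsides;
    %else %let frame=above;

    proc print data=&data(firstobs=&first obs=&last) label noobs
      style(header)={font_weight=bold}
      style(report)={frame=&frame};
      var field description / style(header)={just=l};
      var count min max mean missing_sum na_sum / style(header)={just=r};
      format min max mean comma20.1;
      label field='Variable'
            description='Label'
            count='N'
            min='Minimum'
            max='Maximum'
            mean='Mean'
            missing_sum='-1'
            na_sum='-2';
    run;

    %let first=%eval(&last + 1);
  %end;
%mend print_chunks;

%print_chunks()
```

The rows-per-page value still has to be found by trial, as in the paper, because it depends on the font sizes and margins in the style template. If the number of variables changes from year to year, only the total= value needs updating.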