SPSS Programming and Data Management, 3rd Edition
A Guide for SPSS and SAS® Users

Raynald Levesque and SPSS Inc.

For more information about SPSS® software products, please visit our Web site at http://www.spss.com or contact:

SPSS Inc.
233 South Wacker Drive, 11th Floor
Chicago, IL 60606-6412
Tel: (312) 651-3000
Fax: (312) 651-3668

SPSS is a registered trademark and the other product names are the trademarks of SPSS Inc. for its proprietary computer software. No material describing such software may be produced or distributed without the written permission of the owners of the trademark and license rights in the software and the copyrights in the published materials.

The SOFTWARE and documentation are provided with RESTRICTED RIGHTS. Use, duplication, or disclosure by the Government is subject to restrictions as set forth in subdivision (c) (1) (ii) of The Rights in Technical Data and Computer Software clause at 52.227-7013. Contractor/manufacturer is SPSS Inc., 233 South Wacker Drive, 11th Floor, Chicago, IL 60606-6412.

General notice: Other product names mentioned herein are used for identification purposes only and may be trademarks of their respective companies.

SAS is a registered trademark of SAS Institute Inc. Windows is a registered trademark of Microsoft Corporation. Microsoft® Access, Microsoft® Excel, and Microsoft® Word are products of Microsoft Corporation. DataDirect, DataDirect Connect, INTERSOLV, and SequeLink are registered trademarks of DataDirect Technologies.

Portions of this product were created using LEADTOOLS © 1991–2000, LEAD Technologies, Inc. ALL RIGHTS RESERVED. LEAD, LEADTOOLS, and LEADVIEW are registered trademarks of LEAD Technologies, Inc.

Portions of this product were based on the work of the FreeType Team (http://www.freetype.org).

A portion of the SPSS software contains zlib technology. Copyright © 1995–2002 by Jean-loup Gailly and Mark Adler. The zlib software is provided “as-is,” without express or implied warranty.
In no event shall the authors of zlib be held liable for any damages arising from the use of this software.

A portion of the SPSS software contains Sun Java Runtime libraries. Copyright © 2003 by Sun Microsystems, Inc. All rights reserved. The Sun Java Runtime libraries include code licensed from RSA Security, Inc. Some portions of the libraries are licensed from IBM and are available at http://oss.software.ibm.com/icu4j/. Sun makes no warranties to the software of any kind.

Sax Basic is a trademark of Sax Software Corporation. Copyright © 1993–2004 by Polar Engineering and Consulting. All rights reserved.

SPSS Programming and Data Management, 3rd Edition: A Guide for SPSS and SAS Users
Copyright © 2006 by SPSS Inc.
All rights reserved.
Printed in the United States of America.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

1 2 3 4 5 6 7 8 9 0 09 08 07 06
ISBN 1-56827-374-6

Preface

Experienced data analysts know that a successful analysis or meaningful report often requires more work in acquiring, merging, and transforming data than in specifying the analysis or report itself. SPSS contains powerful tools for accomplishing and automating these tasks. While much of this capability is available through the graphical user interface, many of the most powerful features are available only through command syntax. With release 14.0.1, SPSS makes the programming features of its command syntax significantly more powerful by adding the ability to combine it with a full-featured programming language. This book offers many examples of the kinds of things that you can accomplish using SPSS command syntax by itself and in combination with the Python programming language.

Using This Book

The contents of this book and the accompanying CD are discussed in Chapter 1.
In particular, see the section “Using This Book” if you plan to run the examples on the CD. The CD also contains additional command files, macros, and scripts that are mentioned but not discussed in the book and that can be useful for solving specific problems.

This edition has been updated to include numerous enhanced data management features introduced in SPSS 14.0. Many examples will work with earlier versions, but some examples rely on features not available prior to SPSS 14.0. All of the Python examples require SPSS 14.0.1 or later.

For SAS Users

If you have more experience with SAS than with SPSS for data management, see Chapter 19 for comparisons of the different approaches to handling various types of data management tasks. Quite often, there is not a simple command-for-command relationship between the two programs, although each accomplishes the desired end.

Acknowledgments

This book reflects the work of many members of the SPSS staff who have contributed examples here and in SPSS Developer Central, as well as that of Raynald Levesque, whose examples formed the backbone of earlier editions and remain important in this edition. We also wish to thank Stephanie Schaller, who provided many sample SAS jobs and helped to define what the SAS user would want to see, as well as Marsha Hollar and Brian Teasley, the authors of the original chapter “SPSS for SAS Programmers.”

A Note from Raynald Levesque

It has been a pleasure to be associated with this project from its inception. I have for many years tried to help SPSS users understand and exploit its full potential. In this context, I am thrilled about the opportunities afforded by the Python integration and invite everyone to visit my site at www.spsstools.net for additional examples. And I want to express my gratitude to my spouse, Nicole Tousignant, for her continued support and understanding.

Raynald Levesque

Contents

1 Overview
    Using This Book
    Documentation Resources

Part I: Data Management

2 Best Practices and Efficiency Tips
    Working with Command Syntax
    Creating Command Syntax Files
    Running SPSS Commands
    Syntax Rules
    Customizing the Programming Environment
    Displaying Commands in the Log
    Displaying the Status Bar in Command Syntax Windows
    Protecting the Original Data
    Do Not Overwrite Original Variables
    Using Temporary Transformations
    Using Temporary Variables
    Use EXECUTE Sparingly
    Lag Functions
    Using $CASENUM to Select Cases
    MISSING VALUES Command
    WRITE and XSAVE Commands
    Using Comments
    Using SET SEED to Reproduce Random Samples or Values


