Getting Started with the SAS Macro Language

SAS's macro language is one of its most powerful features, and understanding it is essential if you plan to write complex SAS programs. This article introduces some basic concepts and is intended to help total beginners get their heads around what the macro language is, why you need it and how to use it. SAS publishes documentation for the macro language for both version 8 and version 9.

Why does SAS even have a macro language? Without using the macro language, a base SAS program would only contain DATA steps and PROC steps, plus a few additional statements like TITLEs, OPTIONS etc. Although the DATA step language is quite rich, allowing you to write complex conditional statements, loops, loops within loops, and so on, that richness is confined to that particular DATA step. Viewed as a whole, a SAS program starts at the top, executes each DATA step and PROC step in the specified order and then ends; there's no way to tell SAS to repeat a certain PROC four times, for example.

That's where the macro language comes in - the macro language adds a completely new layer of richness and control to your base SAS programs.
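To make that concrete, here is a minimal sketch of a macro that repeats a PROC four times. The macro name repeatPrint and the use of the sashelp.class sample dataset are illustrative assumptions rather than anything specific to this article, so don't worry about the syntax yet - just notice that the PROC PRINT step is written once but generated four times.

```sas
/* Hypothetical example: repeatPrint and sashelp.class are illustrative choices */
%macro repeatPrint;
  %local i;                                  /* keep the loop counter private to this macro */
  %do i = 1 %to 4;
    title "Run &i of 4";                     /* &i is replaced by 1, 2, 3, 4 in turn */
    proc print data=sashelp.class(obs=&i);   /* print the first &i rows */
    run;
  %end;
%mend repeatPrint;

%repeatPrint;                                /* generates and submits four PROC PRINT steps */
```

Each pass through the %DO loop generates one complete TITLE statement and one complete PROC PRINT step, so Base SAS ends up running four ordinary steps.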

Key Concepts

Here are a few key concepts that are essential to understanding SAS Macro programs. If you don't have a very clear 'mental model' of these concepts, you will constantly encounter baffling situations that make no sense to you! I'm putting these here, right at the start, just to emphasise their importance. Don't worry if they don't mean much to you now - just remember them.

  • Macro programs generate Base SAS programs
  • All macro variables get resolved before the step is submitted
  • Macro variables are always character variables

These aren't hard-and-fast rules, and some experienced SAS programmers may complain that they're wrong, but if you behave as if they're true when first learning Macro, you won't go far wrong. You can start exploring more advanced techniques when you've got a firm footing in the basics.
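As a quick illustration of the third point, the sketch below stores what looks like a sum in a macro variable. The variable name total is just an assumption for the example. Because the value is held as text, no arithmetic happens unless you explicitly ask for it with %EVAL.

```sas
/* Hypothetical example: the macro variable name total is an illustrative choice */
%let total = 1 + 1;
%put total is &total;        /* the log shows the text 1 + 1, not 2 */

%let total = %eval(1 + 1);   /* %EVAL does integer arithmetic on the text */
%put total is &total;        /* the log shows 2, which is still stored as text */
```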

Let's look at them in more detail, and start learning some Macro.

Macro programs generate Base SAS programs

Fundamentally, the Macro language is a code-generator; you write a macro that generates Base SAS code, which is then automatically submitted to the SAS System in the usual way. It's the generated code that does the work - creating output datasets, listings, reports, etc. - not the macro code. The macro code is a stepping stone - a way to generate the necessary Base SAS code without having to write it all out in full every time.

The simplest kind of macro generates the same fixed block of Base SAS code every time it's run:

%macro myMacro;
  data myData;
    x = 1000;
  run;
%mend;

You can use the MPRINT option to tell SAS to show you the generated code in the log as it's submitted. This is a great way to confirm that your macro program is doing what you intended.
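A minimal sketch of MPRINT in use, assuming the myMacro definition from earlier (repeated here so the example stands alone):

```sas
options mprint;      /* ask SAS to write the generated Base SAS code to the log */

%macro myMacro;
  data myData;
    x = 1000;
  run;
%mend;

%myMacro;            /* invoke the macro; the generated DATA step is submitted */

options nomprint;    /* switch the option off again */
```

With MPRINT on, each generated statement should appear in the log on its own line, prefixed with MPRINT(MYMACRO): so that you can tell which macro produced it.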

So what happens when we invoke the macro? The macro processor takes each line of the macro definition, in order, and looks for any macro variable references. If there are any such references, it will try to 'resolve' them - in other words, it will try to replace the macro variable reference with the current value of the macro variable. If it finds (and successfully resolves) any references, it then looks at that line again, looking for more references - just in case the first round of checking-and-resolving has resulted in more potential macro variable references! As soon as the processor finds that there are no more macro variable references left in the line, it gets passed to the normal Base SAS processor.
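The check-resolve-and-check-again behaviour is easier to see with a small sketch. The macro name resolveDemo and the macro variables dsname, n and var1 are assumptions made up for this illustration.

```sas
/* Hypothetical example: resolveDemo, dsname, n and var1 are illustrative names */
%let dsname = myData;
%let n      = 1;
%let var1   = height;

%macro resolveDemo;
  data &dsname;        /* one round of resolving gives: data myData;             */
    &&var&n = 1000;    /* first round gives &var1, second round gives height     */
  run;
%mend resolveDemo;

options mprint;        /* show the generated code in the log */
%resolveDemo;
```

In the first round, the double ampersand collapses to a single ampersand and &n becomes 1, leaving &var1. That is itself a macro variable reference, so the processor looks again and resolves it to height. The code that reaches Base SAS is an ordinary DATA step that creates myData with the statement height = 1000.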

Importantly, as soon as the Base SAS processor has received enough lines of code to form a complete executable step (a whole DATA step, a whole PROC step or an immediately-executable statement like a TITLE statement), that step gets executed. The macro processor waits until the Base code has finished before processing the next line. This is crucial in understanding how a DATA step can influence the value of macro variables later in the macro, so read it twice!
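Here is a minimal sketch of that timing, using the CALL SYMPUT routine to set a macro variable from inside a DATA step. The macro name timingDemo, the variable name rowCount and the value 42 are all assumptions chosen for the illustration.

```sas
/* Hypothetical example: timingDemo, rowCount and the value 42 are illustrative */
%macro timingDemo;
  data _null_;
    call symput('rowCount', '42');   /* runs when the DATA step executes */
  run;                               /* step is complete, so it executes here */

  /* The macro processor only reaches this line after the step above has finished */
  %put rowCount is &rowCount;
%mend timingDemo;

%timingDemo;
```

Because the DATA step has already run by the time the macro processor reaches the %PUT statement, &rowCount resolves to the value the step stored. Without the RUN statement to complete the step, the DATA step would not yet have executed and the reference could not be resolved at that point.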