Chapter 6. Flow control in scripts


This chapter covers:

  • 6.1 The conditional statement
  • 6.2 Looping statements
  • 6.3 Labels, break, and continue
  • 6.4 The switch statement
  • 6.5 Flow control using cmdlets
  • 6.6 Statements as values
  • 6.7 A word about performance
  • 6.8 Summary

I may not have gone where I intended to go, but I think I have ended up where I needed to be.

Douglas Adams, The Long Dark Tea-Time of the Soul

Previous chapters showed how you can solve surprisingly complex problems in PowerShell using only commands and operators. You can select, sort, edit, and present all manner of data by composing these elements into pipelines and expressions. In fact, commands and operators were the only elements available in the earliest prototypes of PowerShell. Sooner or later, though, if you want to write significant programs or scripts, you must add custom looping or branch logic to your solution. This is what we’re going to cover in this chapter: PowerShell’s take on the traditional programming constructs that all languages possess.

The PowerShell flow-control statements and cmdlets are listed in figure 6.1, arranged in groups.

Figure 6.1. The PowerShell flow-control statements

We’ll go through each group in this chapter. As always, behavioral differences exist with the PowerShell flow-control statements that new users should be aware of.

The most obvious difference is that PowerShell typically allows the use of pipelines in places where other programming languages only allow simple expressions. An interesting implication of this pipeline usage is that the PowerShell switch statement is both a looping construct and a conditional statement—which is why it gets its own group.

This is also the first time we’ve dealt with keywords in PowerShell. Keywords are part of the core PowerShell language. This means that, unlike cmdlets, keywords can’t be redefined or aliased. Keywords are also case insensitive so you can write foreach, ForEach, or FOREACH and they’ll all be accepted by the interpreter. (By convention, though, keywords in PowerShell scripts are usually written in lowercase.) Keywords are also context sensitive, which means that they’re only treated as keywords in a statement context—usually as the first word in a statement. This is important because it lets you have both a foreach loop statement and a foreach filter cmdlet, as you’ll see later in this chapter. Let’s begin our discussion with the conditional statement.

6.1. The Conditional Statement

PowerShell has one main conditional statement: the if statement shown in figure 6.2.

Figure 6.2. The syntax of the PowerShell conditional statements

This statement lets a script decide whether an action should be performed by evaluating a conditional expression, then selecting the path to follow based on the results of that evaluation. The PowerShell if statement is similar to the if statement found in most programming languages. The one thing that’s a bit different syntactically is the use of elseif as a single keyword for subsequent clauses. Figure 6.3 shows the structure and syntax of this statement in detail.

Figure 6.3. PowerShell’s version of the if statement, which is the basic conditional statement found in all scripting and programming languages

Let’s work through some examples that illustrate how the if statement works. You’ll use all three of the elements—if, elseif, and else—in this example:

if ($x -gt 100)
{
    "It's greater than one hundred"
}
elseif ($x -gt 50)
{
    "It's greater than 50"
} else
{
    "It's not very big."
}

In this example, if the variable $x holds a value greater than 100, the string “It’s greater than one hundred” will be emitted. If $x is greater than 50 but not greater than 100, it will emit “It’s greater than 50”; otherwise, you’ll get “It’s not very big.” You can have zero or more elseif clauses to test different things. The elseif and else parts are optional, as is the case in other languages.
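
To see all three branches fire, including the boundary case where $x is exactly 100, here is a small sketch that runs the same statement over a few sample values. The sample values and the $labels variable are assumptions made for illustration, and the foreach loop used to drive it is covered later in this chapter.

```powershell
# Run the chapter's if/elseif/else once per sample value and collect the results.
$labels = @()
foreach ($x in 150, 100, 75, 10) {
    if ($x -gt 100) {
        $labels += "It's greater than one hundred"
    }
    elseif ($x -gt 50) {
        $labels += "It's greater than 50"
    }
    else {
        $labels += "It's not very big."
    }
}
$labels
```

The value 100 lands in the elseif branch because -gt is a strict comparison.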

As you might have noticed, the PowerShell if statement is modeled on the if statement found in C-derived languages, including C#, but a couple of differences exist. First, elseif is a single keyword with no spaces allowed between the words. Second, the braces are mandatory around the statement lists, even when you have only a single statement in the list (or no statements for that matter, in which case you would have to type {}). If you try to write something like

if ($x -gt 100) "It's greater than one hundred"

you’ll get a syntax error:

PS (1) > if ($x -gt 100) "It's greater than one hundred"
Missing statement block after if ( condition ).
At line:1 char:17
+ if ($x -gt 100) " <<<< It's greater than one hundred"
PS (2) >

Grammar lessons

The PowerShell grammar technically could support the construction shown in the preceding example. In fact, we did enable this feature at one point, but when people tried it out, it resulted in a lot of errors. The problem is that a newline or a semicolon is required to terminate a command. This leads to the situation where you write something like

if ($x -gt 3) write x is $x while ($x--) $x

and discover that, because you’ve missed the semicolon before the while statement, it writes out the while statement instead of executing it. In the end, the cost of typing a couple of additional characters was more than offset by a decreased error rate. For this reason, the language design team decided to make the braces mandatory.

In general, the syntax of the if statement (and all the PowerShell flow-control statements) is freeform with respect to whitespace. In other words, you can lay out your code pretty much any way you want. You can write an if statement that looks like this

if($true){"true"}else{"false"}

with no whitespace whatsoever. Alternatively, you could also write it like this

if
(
$true
)
{
"true"
}
else
{
"false"
}

where each element is on a separate line.

There’s one constraint on how you can format an if statement: when PowerShell is being used interactively, the else or elseif keyword has to be on the same line as the previous closing brace; otherwise, the interpreter will consider the statement complete and execute it immediately.

It’s important to note that the PowerShell if statement allows a pipeline in the condition clause, not just a simple expression. This means it’s possible to do the following:

if ( dir telly*.txt | select-string penguin )
{
    "There's a penguin on the telly."
}

In this example, the pipeline in the condition part of the if statement will scan all the text files whose names start with “telly” to see whether they contain the word “penguin.” If at least one of the files contains this word, the statement block will be executed, printing out

There's a penguin on the telly.

Here’s another example:

if (( dir *.txt | select-string -List spam ).Length -eq 3)
{
     "Spam! Spam! Spam!"
}

In this case, you search all the text files in the current directory looking for the word “spam.” If exactly three files contain this word, then you print out

Spam! Spam! Spam!

Note

Yes, these are, in fact, Monty Python references. This is where the Python language got its name. If you’re familiar with Python or Perl, you’ll occasionally recognize cultural references from those languages in PowerShell examples here and elsewhere. Many of the PowerShell development team members had their first scripting experiences with those languages.
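
If you want to try pipeline conditions like the two above without hunting for suitable files, the following sketch builds its own. The temporary directory name, the file contents, and the variable names are all assumptions made for this illustration. It also uses @( ... ).Count rather than .Length so that the count is still correct when only one file matches.

```powershell
# Work in a throwaway directory (the name is an assumption for this sketch).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("telly-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir "telly1.txt") -Value "a penguin on the telly"
Set-Content -Path (Join-Path $dir "telly2.txt") -Value "nothing to see here"

# A pipeline that produces at least one object converts to true...
if (Get-ChildItem (Join-Path $dir "telly*.txt") | Select-String penguin) {
    $penguin = $true
} else {
    $penguin = $false
}

# ...and a pipeline that produces nothing converts to false.
if (Get-ChildItem (Join-Path $dir "telly*.txt") | Select-String walrus) {
    $walrus = $true
} else {
    $walrus = $false
}

# Count the files that contain a word; -List stops at the first match per file.
$matchingFiles = @(Get-ChildItem (Join-Path $dir "*.txt") | Select-String -List telly).Count

# Clean up the temporary files.
Remove-Item -Recurse -Force $dir
```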

Because you can use pipelines and subexpressions in the conditional part of an if statement, you can write quite complex conditional expressions in PowerShell. With subexpressions, you can even use an if statement inside the condition part of another if statement. Here’s what this looks like:

PS (2) > $x = 10
PS (3) > if ( $( if ($x -lt 5) { $false } else { $x } ) -gt
>>> 20) { $false } else {$true}
True
PS (4) > $x = 25
PS (5) > if ( $( if ($x -lt 5) { $false } else { $x } ) -gt
>>> 20) { $false } else {$true}
False
PS (6) > $x = 4
PS (7) > if ( $( if ($x -lt 5) { $false } else { $x } ) -gt
>>> 20) { $false } else {$true}
True
PS (8) >

If looking at this makes your head hurt, welcome to the club—it made mine hurt to write it. Let’s dissect this statement and see what it’s doing. Let’s take the inner if statement first:

if ($x -lt 5) { $false } else { $x }

You can see that this statement is straightforward. If $x is less than the number 5, it returns false; otherwise, it returns the value of $x. Based on this, let’s split the code into two separate statements:

$temp = $( if ($x -lt 5) { $false } else { $x } )
if ($temp -gt 20) { $false } else {$true}

What the outer if statement is doing is also pretty obvious: if the result of the first (formerly inner) statement is greater than 20, return $false; otherwise return $true.

Now that you can do branching, let’s move on to the looping statements.

6.2. Looping Statements

Looping is the ability to repeat a set of actions some specific number of times, either based on a count or a condition expression. The PowerShell loop statements cover both of these cases and are shown in figure 6.4.

Figure 6.4. The PowerShell loop statements

6.2.1. The while loop

The while statement (also known as a while loop) is the most basic PowerShell language construct for creating a loop. It executes the commands in the statement list as long as a conditional test evaluates to true. Figure 6.5 shows the while statement syntax.

Figure 6.5. The PowerShell while loop statement syntax

When you execute a while statement, PowerShell evaluates the <pipeline> section of the statement before entering the <statementList> section. The output from the pipeline is then converted to either true or false, following the rules for the Boolean interpretation of values described in chapter 3. As long as this result converts to true, PowerShell reruns the <statementList> section, executing each statement in the list.

For example, the following while statement displays the numbers 1 through 3:

$val = 0
while($val -ne 3)
{
    $val++
    write-host "The number is $val"
}

In this example, the condition ($val isn’t equal to 3) is true while $val is 0, 1, and 2. Each time through the loop, $val is incremented by 1 using the unary ++ increment operator ($val++). The last time through the loop, $val becomes 3. When $val equals 3, the condition evaluates to false and the loop exits.

To more conveniently enter this command at the PowerShell command prompt, you can simply enter it all on one line:

$val=0; while ($val -ne 3){$val++; write-host "The number is $val"}

Notice that the semicolon separates the first command that adds 1 to $val from the second command, which writes the value of $val to the console.
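
Because the condition is converted to a Boolean, it doesn't have to be a comparison. As a minimal sketch (the stack and the variable names are assumptions made for illustration), the following loop uses a plain number as its condition. A nonzero count converts to true and zero converts to false, so the loop runs until the stack is empty.

```powershell
# Fill a .NET stack with the numbers 1 through 3.
$stack = New-Object System.Collections.Stack
foreach ($n in 1..3) { $stack.Push($n) }

# $stack.Count is an integer; while treats nonzero as true and 0 as false.
$popped = @()
while ($stack.Count) {
    $popped += $stack.Pop()
}
"$popped"   # the items come off in reverse order: 3 2 1
```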

You can accomplish all the basic iterative patterns just using the while loop, but PowerShell provides several other looping statements for common cases. Let’s look at those next.

6.2.2. The do-while loop

The other while loop variant in PowerShell is the do-while loop. This is a bottom-tested variant of the while loop. In other words, it always executes the statement list at least once before checking the condition. The syntax of the do-while loop is shown in figure 6.6.

Figure 6.6. The PowerShell do-while loop statement syntax

The do-while loop is effectively equivalent to

<statementList>
while ( <pipeLine> )
{
        <statementList>
}

where the two statement lists are identical.

The final variation of the while loop is the do-until statement. It’s identical to the do-while loop except that the sense of the test is inverted: the statement loops until the condition is true instead of while it is true, as shown in this example:

PS (1) > $i=0
PS (2) > do { $i } until  ($i++ -gt 3)
0
1
2
3
4

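In this case, the statement loops until $i is greater than 3. Because $i++ is a post-increment, the test compares the value $i had before it was incremented, which is why 4 is still printed before the loop ends.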
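
The difference between the top-tested and bottom-tested loops is easiest to see when the condition is false from the start. In this sketch, the starting value and the counter names are assumptions chosen for illustration.

```powershell
# The condition ($n -lt 5) is false from the outset in all three cases.
$n = 10; $whileRuns = 0
while ($n -lt 5) { $whileRuns++; $n++ }          # body never runs

$n = 10; $doWhileRuns = 0
do { $doWhileRuns++; $n++ } while ($n -lt 5)     # body runs once, then the test fails

$n = 10; $doUntilRuns = 0
do { $doUntilRuns++; $n++ } until ($n -ge 5)     # body runs once, then the test succeeds

"$whileRuns $doWhileRuns $doUntilRuns"           # 0 1 1
```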

Having covered the while loop and its variations, we’ll look at the for and foreach loops next.

6.2.3. The for loop

The for loop is the basic counting loop in PowerShell. It’s typically used to step through a collection of objects. It’s not used as often in PowerShell as in other languages because there are usually better ways for processing a collection, as you’ll see with the foreach statement in the next section. But the for loop is useful when you need to know explicitly which element in the collection you’re working with. Figure 6.7 shows the for loop syntax.

Figure 6.7. The PowerShell for loop statement syntax

Notice that the three pipelines in the parentheses are just general pipelines. Conventionally, the initialization pipeline initializes the loop counter variable, the test pipeline tests this variable against some condition, and the increment pipeline increments the loop counter. The canonical example is

PS (1) > for ($i=0; $i -lt 5; $i++) { $i }
0
1
2
3
4
PS (2) >

But because these are arbitrary pipelines, they can do anything. (Note that if initialization and increment pipelines produce output, it’s simply discarded by the interpreter.) Here’s an example where the condition test is used to generate a side effect that’s then used in the statement list body:

PS (2) > for ($i=0; $($y = $i*2; $i -lt 5); $i++) { $y }
0
2
4
6
8
PS (3) >

In this example, the pipeline to be tested is a subexpression that first sets $y to be twice the current value of $i and then compares $i to 5. In the loop body, you use the value in $y to emit the current loop counter times 2. A more practical example would be initializing two values in the initialization pipeline:

PS (3) > for ($($result=@(); $i=0); $i -lt 5; $i++) {$result += $i }
PS (4) > "$result"
0 1 2 3 4

Here you use a subexpression in the initialization pipeline to set $result to the empty array and the counter variable $i to 0. Then the loop counts up to 5, adding each value to the result array.

Note

It’s a little funny to talk about the initialization and increment pipelines. You usually think of pipelines as producing some output. In the for statement, the output from these pipelines is discarded and the side effects of their execution are the interesting parts.
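
As noted at the start of this section, the for loop earns its keep when you need the position of each element as well as its value. Here is a minimal sketch; the $names and $lines variables are assumptions made for illustration.

```powershell
$names = "alpha", "beta", "gamma"
$lines = @()

# $i is both the loop counter and the index into the array.
for ($i = 0; $i -lt $names.Count; $i++) {
    $lines += "{0}: {1}" -f $i, $names[$i]
}
$lines
```

This produces one line per element, each prefixed with its index, starting with 0: alpha.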

Now let’s look at one more example using the for loop. Here you’ll use it to sum up the number of handles used by the svchost processes. First you’ll get a list of these processes:

PS (1) > $svchosts = get-process svchost

You’ll loop through this list and add the handle count for the process to $total

PS (2) > for ($($total=0;$i=0); $i -lt $svchosts.count; $i++)
>> {$total+=$svchosts[$i].handles}
>>

and then print out the total:

PS (3) > $total
3457

So using the for loop is straightforward, but it’s somewhat annoying to have to manage the loop counter. Wouldn’t it be nice if you could just let the loop counter take care of itself? That’s exactly what the foreach loop does for you, so let’s move on.

6.2.4. The foreach loop

Collections are important in any shell (or programming) environment. The whole point of using a scripting language for automation is so that you can operate on more than one object at a time. As you’ve seen in chapters 3 and 4, PowerShell provides many ways of operating on collections. Perhaps the most straightforward of these mechanisms is the foreach loop.

Note

Astute readers will remember that we mentioned a foreach cmdlet (which is an alias for the ForEach-Object cmdlet) as well as the foreach statement at the beginning of the chapter. To reiterate, when the word “foreach” is used at the beginning of a statement, it’s recognized as the foreach keyword. When it appears in the middle of a pipeline, it’s treated as the name of a command.
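
A short sketch shows both uses side by side. The variable names are assumptions made for illustration, and the keyword is written in uppercase only to show that case doesn't matter.

```powershell
# At the start of a statement, FOREACH is the loop keyword (case insensitive).
$fromStatement = @()
FOREACH ($n in 1..3) { $fromStatement += $n * 2 }

# In the middle of a pipeline, foreach is the alias for the ForEach-Object cmdlet.
$fromPipeline = 1..3 | foreach { $_ * 2 }

"$fromStatement"   # 2 4 6
"$fromPipeline"    # 2 4 6
```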

This statement is syntactically identical to the C# foreach loop with the exception that you don’t have to declare the type of the loop variable (in fact, you can’t do this). Figure 6.8 shows you the syntax for the foreach statement.

Figure 6.8. The PowerShell foreach loop statement syntax

Here’s an example. This example loops over all the text files in the current directory, calculating the total size of all the files:

$l = 0; foreach ($f in dir *.txt) { $l += $f.length }

First you set the variable that will hold the total length to 0. Then, in the foreach loop, you use the dir command to get a list of the text files in the current directory (that is, files with the .txt extension). The foreach statement assigns elements from this list one at a time to the loop variable $f and then executes the statement list with this variable set. At the end of the statement, $f will retain the last value that was assigned to it, which is the last value in the list. Compare this example to the for loop example at the end of the previous section. Because you don’t have to manually deal with the loop counter and explicit indexing, this example is significantly simpler.

Note

In C#, the foreach loop variable is local to the body of the loop and is undefined outside of the loop. This isn’t the case in PowerShell; the loop variable is simply another variable in the current scope. After the loop has finished executing, the variable is still visible and accessible outside the loop and will be set to the last element in the list. If you do want to have a locally scoped variable, you can do this with scriptblocks, which are discussed in detail in chapter 8.

Now let’s use a variation of a previous example. Say you want to find out the number of text files in the current directory and the total length of those files. First you’ll initialize two variables: $c to hold the count of the files and $l to hold the total length:

PS (1) > $c=0
PS (2) > $l=0

Next run the foreach statement:

PS (3) > foreach ($f in dir *.txt) {$c += 1; $l += $f.length }

Finally display the results accumulated in the variables:

PS (4) > $c
5
PS (5) > $l
105
PS (6) >

Let’s look at the actual foreach statement in detail now. The <pipeline> part in this example is

dir *.txt

This produces a collection of System.IO.FileInfo objects representing the files in the current directory. The foreach statement loops over this collection, binding each object to the variable $f and then executing the loop body.
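
The counts you get from the example above depend on whatever text files happen to be in your current directory. For a reproducible run, this sketch creates two files of known size in a temporary directory first. The directory name and file contents are assumptions made for illustration.

```powershell
# Create a scratch directory holding two files of 5 and 10 bytes.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("foreach-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
[System.IO.File]::WriteAllText((Join-Path $dir "a.txt"), "12345")
[System.IO.File]::WriteAllText((Join-Path $dir "b.txt"), "1234567890")

# Same loop as in the text, pointed at the scratch directory.
$c = 0
$l = 0
foreach ($f in Get-ChildItem (Join-Path $dir "*.txt")) {
    $c += 1
    $l += $f.Length
}
"$c files, $l bytes"   # 2 files, 15 bytes

# Clean up.
Remove-Item -Recurse -Force $dir
```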

Evaluation order in the foreach loop

It’s important to note that this statement doesn’t stream the results of the pipeline. The pipeline to loop over is run to completion and only then does the loop body begin executing. Let’s take a second to compare this behavior with the way the ForEach-Object cmdlet works. Using the ForEach-Object cmdlet, this statement would look like

dir *.txt | foreach-object { $c += 1; $l += $_.length }

In the case of the ForEach-Object, the statement body is executed as soon as each object is produced. In the foreach statement, all the objects are collected before the loop body begins to execute. This has two implications.

First, because in the foreach statement case all the objects are gathered at once, you need to have enough memory to hold all these objects. In the ForEach-Object case, only one object is read at a time, so less storage is required. From this, you’d think that ForEach-Object should always be preferred. In the bulk-read case, though, there are some optimizations that the foreach statement does that allow it to perform significantly faster than the ForEach-Object cmdlet. The result is a classic speed versus space trade-off. In practice, you rarely need to consider these issues, so use whichever seems most appropriate to the solution at hand.

Note

The ForEach-Object cmdlet is covered later on in this chapter. For Ruby language fans, ForEach-Object is effectively equivalent to the .map() operator.

The second difference is that, in the ForEach-Object case, the execution of the pipeline element generating the object is interleaved with the execution of the ForEach-Object cmdlet. In other words, the command generates one object at a time and then passes it to foreach for processing before generating the next element. This means that the statement list can affect how subsequent pipeline input objects are generated.

Note

Unlike traditional shells where each command is run in a separate process and can therefore run at the same time, in PowerShell they’re alternating—the command on the left side runs and produces an object, and then the command on the right side runs.
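
You can observe this difference in evaluation order by logging when each value is produced and when it is consumed. The following is a sketch under a few assumptions: the Get-Number function and the $log variable are hypothetical names invented for this example, and functions themselves aren't introduced until chapter 7.

```powershell
$log = New-Object System.Collections.ArrayList

# Hypothetical producer: records each value in the log just before emitting it.
function Get-Number {
    foreach ($n in 1..3) {
        [void]$log.Add("produce $n")
        $n
    }
}

# ForEach-Object cmdlet: each object is processed as soon as it is produced.
Get-Number | ForEach-Object { [void]$log.Add("consume $_") }
$streamed = $log -join ", "

# foreach statement: the pipeline runs to completion before the body starts.
$log.Clear()
foreach ($n in Get-Number) { [void]$log.Add("consume $n") }
$collected = $log -join ", "

$streamed    # produce 1, consume 1, produce 2, consume 2, produce 3, consume 3
$collected   # produce 1, produce 2, produce 3, consume 1, consume 2, consume 3
```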

Using the $foreach loop enumerator in the foreach statement

Executing the foreach statement also defines a special variable for the duration of the loop. This is the $foreach variable, and it’s bound to the loop enumerator. (An enumerator is a .NET object that captures the current position in a sequence of objects. The foreach statement keeps track of where it is in the collection through the loop enumerator.) By manipulating the loop enumerator, you can skip forward in the loop. Here’s an example:

PS (1) > foreach ($i in 1..10)
>> { [void] $foreach.MoveNext(); $i + $foreach.current }
>>
3
7
11
15
19
PS (2) >

In this example, the foreach loop iterates over the collection of numbers from 1 to 10. In the body of the loop, the enumerator is used to advance the loop to the next element. It does this by calling the $foreach.MoveNext() method and then retrieving the next value using $foreach.current. This lets you sum up each pair of numbers—(1,2), (3,4), and so on as the loop iterates.
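
The same technique works for any flat list that alternates between two kinds of items. As a sketch (the list contents and the $settings variable are assumptions made for illustration), here a list of alternating names and values is folded into a hashtable.

```powershell
$settings = @{}

# Each iteration takes the key from the loop variable and then advances
# the enumerator by hand to pick up the value that follows it.
foreach ($key in "name", "Bob", "age", 42) {
    [void]$foreach.MoveNext()
    $settings[$key] = $foreach.Current
}

$settings["name"]   # Bob
$settings["age"]    # 42
```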

Note

The foreach statement can be used to iterate over anything PowerShell considers enumerable. This typically includes anything that implements the .NET IEnumerable interface, but PowerShell adapts that slightly. In particular, there are some classes that implement IEnumerable that PowerShell doesn’t consider enumerable. This includes strings and dictionaries or hashtables. Because PowerShell unravels collections freely, you don’t want a string to suddenly be turned into a stream of characters or a hashtable to be shredded into a sequence of key-value pairs. Hashtables in particular are commonly used as lightweight (that is, typeless) objects in the PowerShell environment, so you need to preserve their scalar nature.

The value stored in $foreach is an instance of an object that implements the [System.Collections.IEnumerator] interface. Here’s a quick example that shows you how to look at the members that are available on this object:

PS (1) > [System.Collections.IEnumerator].Getmembers()|foreach{"$_"}
Boolean MoveNext()
System.Object get_Current()
Void Reset()
System.Object Current
PS (2) >

The output of this statement shows the Current and MoveNext() members you’ve used. There’s also a Reset() member that will reset the enumerator back to the start of the collection.

One final thing you need to know about the foreach statement is how it treats scalar objects. Because of the way pipelines work, you don’t know ahead of time if the pipeline will return a collection or a single scalar object. In particular, if the pipeline returns a single object, you can’t tell if it’s returning a scalar or a collection consisting of one object. You can use the @( ... ) construction described in chapter 5 to force an array interpretation, but this ambiguity is common enough that the foreach statement takes care of this by itself. A scalar object in the foreach statement is automatically treated as a one-element collection:

PS (2) > foreach ($i in "hi") {$i }
hi

In this example, the value to iterate over is the scalar string “hi”. The loop executes exactly once, printing hi. This works great most of the time, but there’s one “corner case” that can cause some problems, as you’ll see in the next section.
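
The same scalar treatment applies to hashtables, as mentioned in the earlier note. When you do want to walk the contents of a hashtable or a string, you have to ask for an enumerable explicitly. This sketch uses the standard .NET GetEnumerator() and ToCharArray() methods; the variable names are assumptions made for illustration.

```powershell
$table = @{ a = 1; b = 2 }

# The hashtable itself is treated as a single object: one iteration.
$whole = 0
foreach ($item in $table) { $whole++ }

# Asking for the enumerator yields one key-value entry per iteration.
$entries = 0
foreach ($entry in $table.GetEnumerator()) { $entries++ }

# A string is a single object too, unless you convert it to a char array.
$chars = 0
foreach ($ch in "hi".ToCharArray()) { $chars++ }

"$whole $entries $chars"   # 1 2 2
```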

The foreach loop and $null

Now here’s something that really surprises (and sometimes irritates) people. What happens if the value to iterate over is $null? Let’s find out:

PS (3) > foreach ($i in $null) { "executing" }
executing

So the loop executes. This illustrates that PowerShell treats $null as a scalar value. Now compare this with the empty array:

PS (4) > foreach ($i in @()) { "executing" }
PS (5) >

This time it doesn’t execute. The empty array is unambiguously a collection with no elements, which is quite different from a collection having one member whose value is $null. In other words, @() and @($null) aren’t the same thing. For programmers who are used to $null being nothing, this is a jarring notion. So why does PowerShell work this way? Let’s look at some more examples. First we’ll consider an example where you pass in an array of three nulls:

PS (6) > foreach ($i in $null, $null, $null) {"hi"}
hi
hi
hi

The statement prints hi three times because there were three elements in the array. Now use an array of two elements

PS (7) > foreach ($i in $null, $null) {"hi"}
hi
hi

and it prints hi twice. Logically, if there’s only one $null, it should loop exactly once

PS (8) > foreach ($i in $null) {"hi"}
hi

which is exactly what it does. PowerShell is deeply consistent, even in this case. This is not, though, the expected or even desired behavior in a foreach loop in many cases, so here’s how to work around it. You can use the Write-Output cmdlet (aliased to write) to preprocess the collection you want to iterate over. If the argument to Write-Output is $null, it doesn’t write anything to the output pipe:

PS (9) > foreach ($i in write $null) {"hi"}
PS (10) >

And you see that the loop didn’t execute. So let’s run through the previous example with the arrays of nulls. First, with three nulls

PS (10) > foreach ($i in write $null,$null,$null) {"hi"}
hi
hi
hi

and you get three iterations. Now with two

PS (11) > foreach ($i in write $null,$null) {"hi"}
hi
hi

and you get two iterations. Finally, with one $null

PS (12) > foreach ($i in write $null) {"hi"}
PS (13) >

and this time the loop doesn’t execute. Although this is inconsistent behavior, it matches user expectations and is a good trick to have in your toolkit.

Note

In the first edition of this book, I called this a corner case and suggested that most readers didn’t need to know about this. I was wrong. It comes up on a surprisingly regular basis. In fact, the workaround using Write-Output was suggested by a user, not by the PowerShell team. Let’s hear it for the community!

On that note, let’s move on to a slightly different topic and talk about break, continue, and using labeled loops to exit out of nested loop statements.

6.3. Labels, Break, and Continue

In this section, we’ll discuss how to do nonstructured exits from the various looping statements using the break and continue statements shown in figure 6.9. We’ll also cover labeled loops and how they work with break and continue. But first, some history.

Figure 6.9. The PowerShell break and continue statements, which may optionally take a label indicating which loop statement to break to

In the dawn of computer languages, there was only one flow-control statement: goto. Although it was simple, it also resulted in programs that were hard to understand and maintain. Then along came structured programming. Structured programming introduced the idea of loops with single entry and exit points. This made programs much easier to understand and therefore maintain. Constructs such as while loops and if/then/else statements made it simpler to write programs that are easy to follow.

Note

For the academically inclined reader, Wikipedia.org has a nice discussion on the topic of structured programming.

So structured programming is great—that is, until you have to exit from a set of deeply nested while loops. That’s when pure structured programming leads to pathologically convoluted logic because you have to litter your program with Boolean variables and conditionals trying to achieve the flow of control you need. This is when being a little “impure” and allowing the use of unstructured flow-control elements (including the infamous goto statement) is useful. Now, PowerShell doesn’t actually have a goto statement. Instead, it has break and continue statements and labeled loops. Let’s look at some simple examples. Here’s a while loop that stops counting at 5:

PS (1) > $i=0; while ($true) { if ($i++ -ge 5) { break } $i }
1
2
3
4
5
PS (2) >

Notice in this example that the while loop condition is simply $true. Obviously, this loop would run forever were it not for the break statement. As soon as $i hits 5, the break statement is executed and the loop terminates. Now let’s look at the continue statement. In this example, you have a foreach loop that loops over the numbers from 1 to 10:

PS (1) > foreach ($i in 1..10)
>> {
>>     if ($i % 2)
>>     {
>>         continue
>>     }
>>     $i
>> }
>>
2
4
6
8
10
PS (2) >

If the number isn’t evenly divisible by 2, then the continue statement is executed. Where the break statement immediately terminates the loop, the continue statement causes the flow of execution to jump back to the beginning of the loop and move on to the next iteration. The end result is that only even numbers are emitted. The continue statement skips the line that would have printed the odd numbers.
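
The two statements can of course appear in the same loop. In this sketch (the cutoff of 6 and the $seen variable are assumptions made for illustration), continue skips the odd numbers and break ends the loop once the values get too large.

```powershell
$seen = @()
foreach ($i in 1..10) {
    if ($i % 2) { continue }     # odd: skip to the next iteration
    if ($i -gt 6) { break }      # past the cutoff: leave the loop entirely
    $seen += $i
}
"$seen"   # 2 4 6
```

After the break, $i is left at 8, the value that triggered the exit; the remaining numbers are never visited.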

So the basic break and continue statements can handle flow control in a single loop. But what about nested loops, which was the real problem you wanted to address? This is where labels come in. Before the initial keyword on any of PowerShell’s loop statements, you can add a label naming that statement. Then you can use the break and continue keywords to jump to that statement. Here’s a simple example:

:outer while (1)
{
    while(1)
     {
        break outer;
     }
}

In this example, without the break statement, the loop would repeat forever. Instead, the break will take you out of both the inner and outer loops.
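
A more typical use is abandoning a nested search as soon as you find what you're looking for. The following is a sketch; the grid data, the label name rows, and the variable names are assumptions made for illustration.

```powershell
# Three rows of three cells each.
$grid = @(1, 2, 3), @(4, 5, 6), @(7, 8, 9)
$found = $null
$checked = 0

:rows foreach ($row in $grid) {
    foreach ($cell in $row) {
        $checked++
        if ($cell -eq 5) {
            $found = $cell
            break rows       # exits both loops, not just the inner one
        }
    }
}
"found $found after checking $checked cells"   # found 5 after checking 5 cells
```

With an unlabeled break, only the inner loop would stop and the search would carry on with the third row.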

Note

In PowerShell, labeled break and continue statements have one rather strange but occasionally useful characteristic: they’ll continue to search up the calling stack until a matching label is found. This search will even cross script and function call boundaries. This means that a break inside a function inside a script can transfer control to an enclosing loop in the calling script. This allows for wide-ranging transfer of control. This will make more sense when you get to chapter 7, where functions are introduced.

One last thing to know about the break and continue statements—the name of the label to jump to is actually an expression, not a constant value. You could, for example, use a variable to name the target of the statement. Let’s try this out. First set up a variable to hold the target name:

PS (1) > $target = "foo"

Now use it in a loop. In this loop, if the least significant bit in the value stored in $i is 1 (yet another way to test for odd numbers), you skip to the next iteration of the loop named by $target

PS (2) > :foo foreach ($i in 1..10) {
>> if ($i -band 1) { continue $target } $i
>> }
>>
2
4
6
8
10
PS (3) >

which produces a list of the even numbers in the range 1..10.

At this point, we’ve covered all of the basic PowerShell flow-control statements, as well as using labels and break/continue to do nonlocal flow-control transfers. Now let’s move on to the switch statement, which in PowerShell combines both looping and branching capabilities.

6.4. The Switch Statement

The switch statement, shown in figure 6.10, is the most powerful statement in the PowerShell language. This statement combines pattern matching, branching, and iteration all into a single control structure. This is why it gets its own section instead of being covered under either loops or conditionals.

Figure 6.10. The PowerShell switch statement syntax

At the most basic level, the switch statement in PowerShell is similar to the switch statement in many other languages—it’s a way of selecting an action based on a particular value. But the PowerShell switch statement has a number of additional capabilities. It can be used as a looping construct where it processes a collection of objects instead of just a single object. It supports the advanced pattern matching features that you’ve seen with the -match and -like operators. (How the pattern is matched depends on the flags specified to the switch statement.) Finally, it can be used to efficiently process an entire file in a single statement.

6.4.1. Basic use of the switch statement

Let’s begin by exploring the basic functions of the switch statement. See figure 6.11 for a look at its syntax in detail.

Figure 6.11. The PowerShell switch statement syntax. The switch options control how matching is done. These options are -regex, -wildcard, -exact, and -case. The pipeline produces values to switch on; alternatively, you can specify the sequence -file <expr> instead of ( <pipeline> ). All matching pattern/action clauses are executed; the default clause is executed only if there are no other matches.

This is a pretty complex construct, so let’s start by looking at the simplest form of the statement. Here’s the basic example:

PS (1) > switch (1) { 1 { "One" } 2 { "two" } }
One

The value to switch on is in the parentheses after the switch keyword. In this example, it’s the number 1. That value is matched against the pattern in each clause and all matching actions are taken. You’ll see how to change this in a second.

In this example, the switch value matches 1 so that clause emits the string “One”. Of course, if you change the switch value to 2, you get

PS (2) > switch (2) { 1 { "One" } 2 { "two" } }
two

Now try a somewhat different example. In this case, you have two clauses that match the switch value:

PS (4) > switch (2) { 1 { "One" } 2 { "two" } 2 {"another 2"} }
two
another 2

You can see that both of these actions are executed. As we stated earlier, the switch statement executes all clauses that match the switch value. If you want to stop at the first match, you use the break statement:

PS (5) > switch (2) {1 {"One"} 2 {"two"; break} 2 {"another 2"}}
two

This causes the matching process to stop after the first matching clause is executed. But what happens if no clauses match? Well, the statement quietly returns nothing:

PS (6) > switch (3) { 1 { "One" } 2 { "two"; break } 2 {"another 2"} }
PS (7) >

To specify a default action, you can use the default clause:

PS (7) > switch (3) { 1 { "One" } 2 { "two" } default {"default"} }
default
PS (8) > switch (2) { 1 { "One" } 2 { "two" } default {"default"} }
two

In this example, when the switch value is 3, no clause matches and the default clause is run. But when there’s a match, the default isn’t run, as it’s not considered a match. This covers the basic mode of operation. Now let’s move on to more advanced features.
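To pull the basic pieces together, here is a small sketch that combines break and default in one statement. The function name Get-SizeName and its size codes are hypothetical, chosen only for illustration:

```powershell
# Hypothetical helper: maps a numeric code to a name.
function Get-SizeName ($code) {
    switch ($code) {
        1 { 'small'; break }          # break stops matching after the first hit
        2 { 'medium'; break }
        2 { 'never reached' }         # skipped because of the break above
        default { 'unknown' }         # runs only when nothing else matched
    }
}

Get-SizeName 1   # small
Get-SizeName 2   # medium
Get-SizeName 9   # unknown
```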

6.4.2. Using wildcard patterns with the switch statement

By default, the matching clauses make an equivalence comparison against the object in the clause. If the matching object is a string, the check is done in a case-insensitive way, as you see in the next example:

PS (1) > switch ('abc') {'abc' {"one"} 'ABC' {"two"}}
one
two

The switch value “abc” in this example was matched by both “abc” and “ABC”. You can change this behavior by specifying the -casesensitive option:

PS (2) > switch -case ('abc') {'abc' {"one"} 'ABC' {"two"}}
one

Now the match occurs only when the case of the elements matches.

Note

In this example, we only used the prefix -case instead of the full option string. In fact, only the first letter of the option is checked.

Now let’s discuss the next switch option, -wildcard. When -wildcard is specified, the switch value is converted into a string and the tests are conducted using wildcard patterns. (Wildcard patterns were discussed in chapter 4 with the -like operator.) This is shown in the next example:

PS (4) > switch -wildcard ('abc') {a* {"astar"} *c {"starc"}}
astar
starc

In this example, the pattern a* matches anything that begins with the letter “a” and the pattern *c matches anything that ends with the letter “c.” Again, all matching clauses are executed.

There’s one more element to mention at this point. When a clause is matched, the element that matched is assigned to the variable $_ before running the clause. This is always done, even in the simple examples we discussed earlier, but it wasn’t interesting because you were doing exact comparisons so you already knew what matched. Once you introduce patterns, it’s much more useful to be able to get at the object that matched. For example, if you’re matching against filename extensions, you’d want to be able to get at the full filename to do any processing on that file. We’ll look at some more practical uses for this feature in later sections. For now, here’s a basic example that shows how this match works:

PS (5) > switch -wildcard ('abc') {a* {"a*: $_"} *c {"*c: $_"}}
a*: abc
*c: abc

In the result strings, you can see that $_ was replaced by the full string of the actual switch value.
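As a sketch of the filename scenario mentioned above, the following classifies a name by its extension and uses $_ to report the full name that matched. The function name Get-FileKind is hypothetical:

```powershell
# Hypothetical helper: classifies a filename by its extension.
function Get-FileKind ([string] $name) {
    switch -wildcard ($name) {
        '*.txt' { "text file: $_"; break }   # $_ holds the full switch value
        '*.log' { "log file: $_"; break }
        default { "other: $_" }
    }
}

Get-FileKind 'notes.TXT'    # text file: notes.TXT (matching ignores case by default)
Get-FileKind 'setup.log'    # log file: setup.log
Get-FileKind 'kernel.dll'   # other: kernel.dll
```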

6.4.3. Using regular expressions with the switch statement

As we discussed in chapter 4, the wildcard patterns, while useful, have limited capabilities. For more sophisticated pattern matching, you used regular expressions.

Regular expressions are available in the switch statement through the -regex flag. Let’s rewrite the previous example using regular expressions instead of wildcards:

PS (6) > switch -regex ('abc') {^a {"a*: $_"} 'c$' {"*c: $_"}}
a*: abc
*c: abc

As you see, $_ is still bound to the entire matching key. But one of the most powerful features of regular expressions is submatches. A submatch, or capture, is a portion of the regular expression that’s enclosed in parentheses, as discussed in chapter 4 with the -match operator. With the -match operator, the submatches are made available through the $matches variable. This same variable is also used in the switch statement. The next example shows how this works:

PS (8) > switch -regex ('abc') {'(^a)(.*$)' {$matches}}

Key                            Value
---                            -----
2                              bc
1                              a
0                              abc

In the result shown here, $matches[0] is the overall key; $matches[1] is the first submatch, in this case the leading “a”; and $matches[2] is the remainder of the string. As always, matching is case insensitive by default, but you can specify the -case option to make it case sensitive, as shown here:

PS (9) > switch -regex ('abc') {'(^A)(.*$)' {$matches}}

Key                            Value
---                            -----
2                              bc
1                              a
0                              abc

PS (10) > switch -regex -case ('abc') {'(^A)(.*$)' {$matches}}

In the first command, you changed the match pattern from a to A and the match still succeeded because case was ignored. In the second command, you added the -case flag and this time the match didn’t succeed.
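Submatches become more useful when you want to pull pieces out of the text. Here is a minimal sketch that splits a "name = value" line into its two parts; the sample line and the variable names are assumptions for illustration:

```powershell
# Hypothetical input line in "name = value" form.
$line = 'timeout = 30'

switch -regex ($line) {
    '^\s*(\w+)\s*=\s*(.*)$' {
        $name  = $matches[1]   # first capture: the setting name
        $value = $matches[2]   # second capture: everything after the equals sign
    }
}

"$name -> $value"   # timeout -> 30
```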

So far we’ve discussed three ways to control how matching against the switch value works—in other words, three matching modes (actually six, because the -case flag can be used with any of the previous three). But what if you need something a bit more sophisticated than a simple pattern match? The switch statement lets you handle this by specifying an expression in braces instead of a pattern. In the next example, you specify two expressions that check against the switch value. Again the switch value is made available through the variable $_:

PS (11) > switch (5) {
>> {$_ -gt 3} {"greater than three"}
>> {$_ -gt 7} {"greater than 7"}}
>>
greater than three
PS (12) > switch (8) {
>> {$_ -gt 3} {"greater than three"}
>> {$_ -gt 7} {"greater than 7"}}
>>
greater than three
greater than 7
PS (13) >

In the first statement, only the first clause was triggered because 5 is greater than 3 but less than 7. In the second statement, both clauses fired.

You can use these matching clauses with any of the other three matching modes:

PS (13) > switch (8) {
>> {$_ -gt 3} {"greater than three"}
>> 8 {"Was $_"}}
>>
greater than three
Was 8

The first expression, {$_ -gt 3}, evaluated to true so “greater than three” was printed, and the switch value matched 8 so “Was 8” also printed (where $_ was replaced by the matching value).
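Expression clauses combine naturally with break when you want the first matching range to win. The following sketch assumes a hypothetical function Get-TemperatureBand with made-up thresholds:

```powershell
# Hypothetical helper: the thresholds are arbitrary and only illustrate ordering.
function Get-TemperatureBand ([int] $celsius) {
    switch ($celsius) {
        { $_ -lt 10 } { 'cold'; break }   # tested first; break stops further matching
        { $_ -lt 25 } { 'mild'; break }
        default       { 'hot' }
    }
}

Get-TemperatureBand 5    # cold
Get-TemperatureBand 18   # mild
Get-TemperatureBand 30   # hot
```

Without the break statements, a value such as 5 would satisfy both of the first two clauses and emit two strings.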

Now you have exact matches, pattern matches, conditional matches, and the default clause. But what about the switch value itself? So far, all the examples have been simple scalar values. What happens if you specify a collection of values? This is where the switch statement acts like a form of loop.

Note

switch works like the other looping statements in that the expression in the parentheses is fully evaluated before it starts iterating over the individual values.

Let’s look at another example where you specify an array of values:

PS (2) > switch(1,2,3,4,5,6) {
>> {$_ % 2} {"Odd $_"; continue}
>> 4 {"FOUR"}
>> default {"Even $_"}
>> }
>>
Odd 1
Even 2
Odd 3
FOUR
Odd 5
Even 6

In this example, the switch value is 1,2,3,4,5,6. The switch statement loops over the collection, testing each element against all the clauses. The first clause returns “Odd $_” if the current switch element isn’t evenly divisible by 2. The next clause prints out “FOUR” if the value is 4. The default clause prints out “Even $_” if the number is even. Note the use of continue in the first clause. This tells the switch statement to stop matching any further clauses and move on to the next element in the collection. In this instance, continue works in the switch statement the same way that it works in the other loops: it skips the remainder of the body of the loop and continues on with the next loop iteration. What happens if you use break instead of continue?

PS (3) > switch(1,2,3,4,5,6) {
>> {$_ % 2} {"Odd $_"; break}
>> 4 {"FOUR"}
>> default {"Even $_"}
>> }
>>
Odd 1

As with the other loops, break doesn’t just skip the remainder of the current iteration; it terminates the overall loop processing. (If you want to continue iterating, use continue instead. More on that later.)
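One place where this behavior of break is handy is stopping at a sentinel value. This is only a sketch; the word STOP and the variable $seen are assumptions for illustration:

```powershell
# Collect items until the hypothetical sentinel 'STOP' is reached.
$seen = switch ('alpha', 'beta', 'STOP', 'gamma') {
    'STOP'  { break }   # terminates the whole switch loop, so 'gamma' is never examined
    default { $_ }      # pass every other element through
}

$seen   # alpha, beta
```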

Of course, iterating over a fixed collection isn’t very interesting. In fact, you can use a pipeline in the switch value, as the next example shows. In this example, you want to count the number of DLLs, text files, and log files in the directory c:\windows. First you initialize the counter variables:

PS (1) > $dll=$txt=$log=0

Now you run the actual switch statement. This switch statement uses wildcard patterns to match the extensions on the filenames. The associated actions increment a variable for each extension type:

PS (2) > switch -wildcard (dir c:\windows)
>> {*.dll {$dll++} *.txt {$txt++} *.log {$log++}}

Once you have the totals, display them:

PS (3) > "dlls: $dll text files: $txt log files: $log"
dlls: 6 text files: 9 log files: 120

Note that in this example the pipeline element is being matched against every clause. Because a file can’t have more than one extension, this doesn’t affect the output, but it does affect performance somewhat. It’s faster to include a continue statement after each clause so the matching process stops as soon as the first match succeeds.

Here’s something else we glossed over earlier in our discussion of $_—it always contains the object that was matched against. This is important to understand when you’re using the pattern matching modes of the switch statement. The pattern matches create a string representation of the object to match against, but $_ is still bound to the original object. Here’s an example that illustrates this point. This is basically the same as the previous example, but this time, instead of counting the number of files, you want to calculate the total size of all the files having a particular extension. Here are the revised commands:

PS (1) > $dll=$txt=$log=0
PS (2) > switch -wildcard (dir) {
>> *.dll {$dll+= $_.length; continue}
>> *.txt {$txt+=$_.length; continue}
>> *.log {$log+=$_.length; continue}
>> }
>>
PS (3) > "dlls: $dll text files: $txt log files: $log"
dlls: 166913 text files: 1866711 log files: 6669437
PS (4) >

Notice how you’re using $_.length to get the length of the matching file object. If $_ were bound to the matching string, you’d be counting the lengths of the filenames instead of the lengths of the actual files.
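The same point can be shown without touching the file system. In this sketch, which uses [version] objects purely as an assumed stand-in for file objects, the wildcard pattern is tested against the string form of each object, but $_ is still the original object, so its properties remain available:

```powershell
# Assumed sample data: three System.Version objects.
$versions = ([version]'1.2.3'), ([version]'2.0.1'), ([version]'1.9.0')

$minors = switch -wildcard ($versions) {
    # The pattern sees the strings '1.2.3' and '1.9.0', but $_ is the Version object.
    '1.*' { $_.Minor }
}

$minors   # 2, 9
```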

6.4.4. Processing files with the switch statement

There’s one last mode of operation for the switch statement to discuss: the -file option. Instead of specifying an expression to iterate over as the switch value, the -file option allows you to name a file to process. Here’s an example that processes the Windows update log file. Again start by initializing the counter variables:

PS (1) > $au=$du=$su=0

Next use the -regex and -file options to access and scan the file WindowsUpdate.log, and check for update requests from Windows Update, Windows Defender, and SMS:

PS (2) > switch -regex -file c:\windows\windowsupdate.log {
>> 'START.*Finding updates.*AutomaticUpdates' {$au++}
>> 'START.*Finding updates.*Defender' {$du++}
>> 'START.*Finding updates.*SMS' {$su++}
>> }
>>

Print the results:

PS (3) > "Automatic:$au Defender:$du SMS:$su"
Automatic:195 Defender:10 SMS:34

Now it’s possible to do basically the same thing by using Get-Content or even the file system name trick you learned in chapter 4:

PS (4) > $au=$du=$su=0
PS (5) > switch -regex (${c:windowsupdate.log}) {
>> 'START.*Finding updates.*AutomaticUpdates' {$au++}
>> 'START.*Finding updates.*Defender' {$du++}
>> 'START.*Finding updates.*SMS' {$su++}
>> }
>>
PS (6) > "Automatic:$au Defender:$du SMS:$su"
Automatic:195 Defender:10 SMS:34

This code uses ${c:windowsupdate.log} to access the file content instead of -file. So why have the -file option? There are two reasons.

The -file option reads one line at a time, so it uses less memory than the Get-Content cmdlet, which has to read the entire file into memory before processing. Also, because -file is part of the PowerShell language, the interpreter can do some optimizations, which gives -file performance advantages.

So, overall, the -file option can potentially give you both speed and space advantages in some cases (the space advantage typically being the more significant, and therefore the more important of the two). When your task involves processing a lot of text files, the -file switch can be a useful tool.
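If you don’t have a Windows update log at hand, you can try the -file option against a throwaway file. The sketch below writes four made-up log lines to a temporary file and counts them by severity; the file contents and variable names are assumptions:

```powershell
# Create a temporary file with a few invented log lines.
$path = [IO.Path]::GetTempFileName()
Set-Content -Path $path -Value 'ERROR disk full', 'INFO started', 'ERROR timeout', 'WARN slow'

$errorCount = $warningCount = 0
switch -regex -file $path {
    '^ERROR' { $errorCount++; continue }     # continue skips the remaining clauses for this line
    '^WARN'  { $warningCount++; continue }
}
Remove-Item $path

"errors: $errorCount warnings: $warningCount"   # errors: 2 warnings: 1
```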

6.4.5. Using the $switch loop enumerator in the switch statement

One more point: just as the foreach loop used $foreach to hold the loop enumerator, the switch statement uses $switch to hold the switch loop enumerator. This is useful in a common pattern—processing a list of options. Say you have a list of options where the option -b takes an argument and -a, -c, and -d don’t. You’ll write a switch statement to process a list of these arguments. First set up a list of test options. For convenience, start with a string and then use the -split operator to break it into an array of elements:

PS (1) > $options= -split "-a -b Hello -c"

Next initialize the set of variables that will correspond to the flags:

PS (2) > $a=$c=$d=$false
PS (3) > $b=$null

Now you can write your switch statement. The interesting clause is the one that handles -b. This clause uses the enumerator stored in $switch to advance the item being processed to the next element in the list. Use a cast to [void] to discard the return value from the call to $switch.movenext() (more on that later). Then use $switch.current to retrieve the next value and store it in $b. The loop continues processing the remaining arguments in the list.

PS (4) > switch ($options)
>> {
>> '-a' { $a=$true }
>> '-b' { [void] $switch.movenext(); $b= $switch.current }
>> '-c' { $c=$true }
>> '-d' { $d=$true }
>> }
>>

The last step in this example is to print the arguments in the list to make sure they were all set properly:

PS (5) > "a=$a b=$b c=$c d=$d"
a=True b=Hello c=True d=False
PS (6) >

You see that $a and $c are true, $b contains the argument “Hello”, and $d is still false because it wasn’t in your list of test options. The option list has been processed correctly.

Note

This isn’t a robust example because it’s missing all error handling. In a complete example, you’d have a default clause that generated errors for unexpected options. Also, in the clause that processes the argument for -b, rather than discarding the result of MoveNext() it should check the result and generate an error if it returns false. This would indicate that there are no more elements in the collection, so -b would be missing its mandatory argument.
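One way the more robust version might look is sketched here. The function name Read-OptionList, the use of a hashtable for the results, and the error messages are all assumptions made for illustration:

```powershell
# Hypothetical helper: parses the option list and returns the settings in a hashtable.
function Read-OptionList ([string[]] $optionList) {
    $result = @{ a = $false; b = $null; c = $false; d = $false }
    switch ($optionList) {
        '-a' { $result.a = $true }
        '-b' {
            # MoveNext() returns $false when the list is exhausted.
            if (-not $switch.MoveNext()) { throw 'Option -b requires an argument.' }
            $result.b = $switch.Current
        }
        '-c' { $result.c = $true }
        '-d' { $result.d = $true }
        default { throw "Unexpected option: $_" }
    }
    $result
}

$parsed = Read-OptionList (-split '-a -b Hello -c')
"a=$($parsed.a) b=$($parsed.b) c=$($parsed.c) d=$($parsed.d)"   # a=True b=Hello c=True d=False
```

Calling Read-OptionList @('-b') or Read-OptionList @('-x') raises an error instead of silently producing incomplete settings.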

This finishes the last of the flow-control statements in the PowerShell language, but as you saw at the beginning of this chapter, there’s another way to do selection and iteration in PowerShell by using cmdlets. In the next section, we’ll go over a couple of the cmdlets that are a standard part of the PowerShell distribution. These cmdlets let you control the flow of your script in a manner similar to the flow-control statements. (In later sections, we’ll look at how you can create your own specialized flow-control elements in PowerShell.)


6.5. Flow Control Using Cmdlets

PowerShell’s control statements are part of the language proper, but there are also some cmdlets, shown in figure 6.12, that can be used to accomplish similar kinds of things.

Figure 6.12. Flow-control cmdlets

These cmdlets use blocks of PowerShell script enclosed in braces to provide the “body” of the control statement. These pieces of script are called scriptblocks and are described in detail in chapter 8. The two flow-control cmdlets that you’ll encounter most frequently are ForEach-Object and Where-Object.

6.5.1. The ForEach-Object cmdlet

The ForEach-Object cmdlet operates on each object in a pipeline in much the same way that the foreach statement operates on the set of values that are provided to it. For example, here’s a foreach statement that prints the size of each text file in the current directory:

PS (1) > foreach ($f in dir *.txt) { $f.length }
48
889
23723
328
279164

Using the ForEach-Object cmdlet, the same task can be accomplished this way:

PS (2) > dir *.txt | foreach-object {$_.length}
48
889
23723
328
279164

The results are the same, so what’s the difference? One obvious difference is that you don’t have to create a new variable name to hold the loop value. The automatic variable $_ is used as the loop variable.

Note

Automatic variables are common in scripting languages. These variables aren’t directly assigned to in scripts. Instead, they are set as the side effect of an operation. One of the earlier examples of this is in AWK. When a line is read in AWK, the text of the line is automatically assigned to $0. The line is also split into fields. The first field is placed in $1, the second is in $2, and so on. The Perl language is probably the most significant user of automatic variables. In fact, as mentioned previously, Perl inspired the use of $_ in PowerShell. Automatic variables can help reduce the size of a script, but they can also make a script hard to read and difficult to reuse because your use of automatics may collide with mine. From a design perspective, our approach with automatic variables follows the salt curve. A little salt makes everything taste better. Too much salt makes food inedible. The language design team tried to keep the use of automatics in PowerShell at the “just right” level. Of course, this is always a subjective judgment. Some people really like salt.

A more subtle difference, as discussed previously, is that the loop is processed one object at a time. In a normal foreach loop, the entire list of values is generated before a single value is processed. In the ForEach-Object pipeline, each object is generated and then passed to the cmdlet for processing.

The ForEach-Object cmdlet has an advantage over the foreach loop in the amount of space being used at a particular time. For example, if you’re processing a large file, the foreach loop would have to load the entire file into memory before processing. When you use the ForEach-Object cmdlet, the file will be processed one line at a time. This significantly reduces the amount of memory needed to accomplish a task.
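You can observe the difference in processing order with a small experiment. In this sketch, the function Get-Number and the $trace list are hypothetical; the function records when it produces each value, and the consumer records when it receives one:

```powershell
$trace = New-Object System.Collections.ArrayList

# Hypothetical producer: notes each value in $trace just before emitting it.
function Get-Number {
    foreach ($n in 1..3) {
        [void] $trace.Add("produce $n")   # $trace is found in the caller's scope
        $n
    }
}

# Pipeline: each value is consumed as soon as it is produced.
Get-Number | ForEach-Object { [void] $trace.Add("consume $_") }
$pipelineOrder = $trace -join ', '

# foreach statement: every value is produced before the first one is consumed.
$trace.Clear()
foreach ($n in Get-Number) { [void] $trace.Add("consume $n") }
$statementOrder = $trace -join ', '

$pipelineOrder    # produce 1, consume 1, produce 2, consume 2, produce 3, consume 3
$statementOrder   # produce 1, produce 2, produce 3, consume 1, consume 2, consume 3
```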

You’ll end up using the ForEach-Object cmdlet a lot in command lines to perform simple transformations on objects (you’ve already used it in many examples so far). Given the frequency of use, there are two standard aliases for this cmdlet. The first one is (obviously) foreach. But wait a second—didn’t we say earlier in this chapter that foreach is a keyword and keywords can’t be aliased? This is true, but remember, keywords are only special when they’re the first unquoted word in a statement (in other words, not a string). If they appear anywhere else (for example, as an argument or in the middle of a pipeline), they’re just another command with no special meaning to the language. Here’s another way to think about it: the first word in a statement is the key that the PowerShell interpreter uses to decide what kind of statement it’s processing, hence the term “keyword.”

This positional constraint is how the interpreter can distinguish between the keyword foreach

foreach ($i in 1..10) { $i }

and the aliased cmdlet foreach:

1..10 | foreach {$_}

When foreach is the first word in a statement, it’s a keyword; otherwise it’s the name of a command.

Now let’s look at the second alias. Even though foreach is significantly shorter than ForEach-Object, there have still been times when users wanted it to be even shorter.

Note

Users wanted to get rid of this notation entirely and have foreach be implied by an open brace following the pipe symbol. This would have made about half of PowerShell users very happy. Unfortunately, the other half were adamant that the implied operation be Where-Object instead of ForEach-Object.

Where extreme brevity is required, there’s a second built-in alias that’s simply the percent sign (%). Now readers are saying, “You told us the percent sign is an operator!” Well, that’s true, but only when it’s used as a binary operator. If it appears as the first symbol in a statement, it has no special meaning, so you can use it as an alias for ForEach-Object. As with keywords, operators are also context sensitive.

Using the % alias results in very concise (but occasionally hard-to-read) statements such as the following, which prints the numbers from 1 to 5, times 2:

PS (1) > 1..5|%{$_*2}
2
4
6
8
10
PS (2) >

Clearly this construction is great for interactive use where brevity is important, but it probably shouldn’t be used when writing scripts. The issue is that ForEach-Object is so useful that a single-character symbol for it, one that is easy to distinguish, is invaluable for experienced PowerShell users. But unlike the word foreach, % isn’t immediately meaningful to new users. So this notation is great for “conversational” PowerShell, but generally terrible for scripts that you want other people to be able to read and maintain.

The last thing to know about the ForEach-Object cmdlet is that it can take multiple scriptblocks. If three scriptblocks are specified, the first one is run before any objects are processed, the second is run once for each object, and the last is run after all objects have been processed. This is good for conducting accumulation-type operations. Here’s another variation that sums the number of handles used by the service host svchost processes:

PS (3) > gps svchost |%{$t=0}{$t+=$_.handles}{$t}
3238

The standard alias for Get-Process is gps. This is used to get a list of processes where the process name matches svchost. These process objects are then piped into ForEach-Object, where the handle counts are summed up in $t and then emitted in the last scriptblock. This example uses the % alias to show how concise these expressions can be. In an interactive environment, brevity is important.
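For a script, the same accumulation pattern is easier to read when the three scriptblocks are named explicitly. Here is a minimal sketch that sums a fixed range so the result doesn’t depend on which processes are running; the variable names are assumptions:

```powershell
$total = 1..5 | ForEach-Object -Begin   { $sum = 0 } `
                               -Process { $sum += $_ } `
                               -End     { $sum }

$total   # 15
```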

And here’s something to keep in mind when using ForEach-Object. The ForEach-Object cmdlet works like all cmdlets: if the output object is a collection, it gets unraveled. One way to suppress this behavior is to use the unary comma operator. For example, in the following, you assign $a an array of two elements, the second of which is a nested array:

PS (1) > $a = 1,(2,3)

When you check the length, you see that it is 2 as expected

PS (2) > $a.length
2

and the second element is still an array:

PS (3) > $a[1]
2
3

But if you run it through ForEach-Object, you’ll find that the length of the result is now 3, and the second element in the result is the number 2:

PS (4) > $b = $a | foreach { $_ }
PS (5) > $b.length
3
PS (6) > $b[1]
2

In effect, the result has been “flattened.” But if you use the unary comma operator before the $_ variable, the result has the same structure as the original array:

PS (7) > $b = $a | foreach { , $_ }
PS (8) > $b.length
2
PS (9) > $b[1]
2
3

When chaining foreach cmdlets, you need to repeat the pattern at each stage:

PS (7) > $b = $a | foreach { , $_ } | foreach { , $_ }
PS (8) > $b.length
2
PS (9) > $b[1]
2
3

Why don’t you just preserve the structure as you pass the elements through instead of unraveling by default? Well, both behaviors are, in fact, useful. Consider the following example, which returns a list of loaded module names:

Get-Process | %{$_.modules} | sort -u modulename

Here the unraveling is exactly what you want. When we were designing PowerShell, we considered both cases; and in applications, on average, unraveling by default was usually what we needed. Unfortunately, it does present something of a cognitive bump that surprises users learning to use PowerShell.

Using the return statement with ForEach-Object

Here’s another tidbit of information about something that occasionally causes problems. Although the ForEach-Object cmdlet looks like a PowerShell statement, remember that it is in fact a command and the body of code it executes is a scriptblock, also known as an anonymous function. (By anonymous, we just mean that we haven’t given it a name. Again, we cover this in detail in chapter 11.) The important thing to know is that the return statement (see chapter 7), when used in the scriptblock argument to ForEach-Object, only exits from the ForEach-Object scriptblock, not from the function or script that is calling ForEach-Object. So, if you do want to return out of a function or script in a foreach loop, either use the foreach statement where the return will work as desired, or use the nonlocal labeled break statement discussed earlier in this chapter.
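The difference is easy to see side by side. In the following sketch, both function names are hypothetical; each tries to return the first even number it finds:

```powershell
# return only leaves the scriptblock, so the pipeline keeps going.
function Find-FirstEvenWithCmdlet {
    1..6 | ForEach-Object { if ($_ % 2 -eq 0) { return $_ } }
    'after pipeline'
}

# return leaves the whole function as soon as the first even number is found.
function Find-FirstEvenWithStatement {
    foreach ($n in 1..6) { if ($n % 2 -eq 0) { return $n } }
    'after loop'
}

Find-FirstEvenWithCmdlet      # 2, 4, 6, after pipeline
Find-FirstEvenWithStatement   # 2
```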

How ForEach-Object processes its arguments

Let’s talk for a moment about how the ForEach-Object cmdlet processes its argument scriptblocks. A reader of the first edition of this book observed what he thought was an inconsistency between how the cmdlet is documented and how the following example behaves:

$words | ForEach-Object {$h=@{}} {$h[$_] += 1}

The help text for the cmdlet (use help ForEach-Object -Full to see this text) says that the -Process parameter is the only positional parameter and that it’s in position 1. Therefore, according to the help file, since the -Begin parameter isn’t positional, the example shouldn’t work. This led the reader to assume that either there was an error in the help file, or that he misunderstood the idea of positional parameters.

In fact the help file is correct (because the cmdlet information is extracted from the code) but the way it works is tricky.

If you look at the signature of the -Process parameter, you’ll see that, yes, it is positional, but it also takes a collection of scriptblocks and receives all remaining unbound arguments. So, in the case of

dir | foreach {$sum=0} {$sum++} {$sum}

the -Process parameter is getting an array of three scriptblocks, whereas -Begin and -End are empty. Now here’s the trick. If -Begin is empty and -Process has more than one scriptblock in the collection, then the first one is treated as the -Begin scriptblock and the second one is treated as the -Process scriptblock; if -End is also empty and there are three or more scriptblocks, the last one is treated as the -End scriptblock. If -Begin is specified but -End is not and there are two scriptblocks, then the first one is treated as the Process clause and the second one is the End clause. Finally, if both -Begin and -End are specified, the remaining arguments will be treated as multiple Process clauses. This allows

dir | foreach {$sum=0} {$sum++} {$sum}
dir | foreach -begin {$sum=0} {$sum++} {$sum}
dir | foreach {$sum=0} {$sum++} -end {$sum}
dir | foreach -begin {$sum=0} {$sum++} -end {$sum}

and

dir | foreach -begin {$sum=0} -process {$sum++} -end {$sum}

to all work as expected.
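The two-scriptblock form from the reader’s question can be checked with a short experiment. In this sketch, the sample sentence and the variable name $counts are assumptions; if the first scriptblock is run once as the Begin clause, the hashtable is created a single time and the counts accumulate:

```powershell
# Assumed sample data.
$words = -split 'the quick brown fox jumps over the lazy dog the end'

# First scriptblock runs once (Begin); second runs once per word (Process).
$words | ForEach-Object { $counts = @{} } { $counts[$_] += 1 }

$counts['the']   # 3
$counts['fox']   # 1
```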

On that note, we’re finished with our discussion of ForEach-Object. We’ll touch on it again in chapter 8 when we discuss scriptblocks, but for now, let’s move on to the other flow-control cmdlet commonly used in PowerShell (which, by the way, also uses scriptblocks—you may detect a theme here).

6.5.2. The Where-Object cmdlet

The other common flow-control cmdlet is the Where-Object cmdlet. This cmdlet is used to select objects from a stream, kind of like a simple switch cmdlet. It takes each pipeline element it receives as input, executes its scriptblock (see!) argument, passing in the current pipeline element as $_, and then, if the scriptblock evaluates to true, the element is written to the pipeline. We’ll show this with yet another way to select even numbers from a sequence of integers:

PS (4) > 1..10 | where {-not ($_ -band 1)}
2
4
6
8
10

The scriptblock enclosed in the braces receives each pipeline element, one after another. If the least significant bit in the element is 1, then the scriptblock returns the logical complement of that value ($false) and that element is discarded. If the least significant bit is 0, the logical complement of that is $true and the element is written to the output pipeline. Notice that the common alias for Where-Object is simply where. And, as with ForEach-Object, because this construction is so commonly used interactively, there’s an additional alias, which is simply the question mark (?). This allows the previous example to be written as

PS (5) > 1..10|?{!($_-band 1)}
2
4
6
8
10

Again, this is brief, but it looks like the cat walked across the keyboard (trust me on this one). So, as before, although this is fine for interactive use, it isn’t recommended in scripts because it’s hard to understand and maintain. As another, more compelling example of “Software by Cats,” here’s a pathological example that combines elements from the last few chapters—type casts, operators, and the flow-control cmdlets—to generate a list of strings of even-numbered letters in the alphabet, where the length of the string matches the ordinal number in the alphabet (“A” is 1, “B” is 2, and so on):

PS (1) > 1..26|?{!($_-band 1)}|%{[string][char]([int][char]'A'+$_-1)*$_}
>>
BB
DDDD
FFFFFF
HHHHHHHH
JJJJJJJJJJ
LLLLLLLLLLLL
NNNNNNNNNNNNNN
PPPPPPPPPPPPPPPP
RRRRRRRRRRRRRRRRRR
TTTTTTTTTTTTTTTTTTTT
VVVVVVVVVVVVVVVVVVVVVV
XXXXXXXXXXXXXXXXXXXXXXXX
ZZZZZZZZZZZZZZZZZZZZZZZZZZ
PS (2) >

The output is fairly self-explanatory, but the code isn’t. Figuring out how this works is left as an exercise to the reader and as a cautionary tale not to foist this sort of rubbish on unsuspecting coworkers. They know where you live.
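For comparison, here is a sketch of how the same result might be written for a script that other people have to maintain. The variable names are assumptions, and the foreach statement stands in for the two cmdlets:

```powershell
$result = foreach ($position in 1..26) {
    # Skip odd positions (A, C, E, ...).
    if ($position % 2 -ne 0) { continue }

    # Position 1 is 'A', so add the offset to the character code of 'A'.
    $letter = [string][char]([int][char]'A' + $position - 1)

    # Multiplying a string by a number repeats it.
    $letter * $position
}

$result[0]   # BB
```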

Where-Object and Get-Content’s -ReadCount Parameter

On occasion, a question comes up about the Get-Content cmdlet and how its -ReadCount parameter works. This can be an issue particularly when using this cmdlet and parameter with Where-Object to filter the output of Get-Content. The issue comes up when the read count is greater than 1. This causes PowerShell to act as if some of the objects returned from Get-Content are being skipped and affects both ForEach-Object and Where-Object. After all, these cmdlets are supposed to process or filter the input one object at a time and this isn’t what appears to be happening.

Here’s what’s going on. Unfortunately the -ReadCount parameter has a confusing name. From the PowerShell user’s perspective, it has nothing to do with reading. What it does is control the number of records written to the next pipeline element, in this case Where-Object or ForEach-Object. The following examples illustrate how this works. In these examples, you’ll use a simple text file named test.txt, which contains 10 lines of text, and the ForEach-Object cmdlet (through its alias %) to count the length of each object being passed down the pipeline. You’ll use the @( ... ) construct to guarantee that you’re always treating $_ as an array. Here are the examples with -readcount varying from 1 to 4:

PS (119) > gc test.txt -ReadCount 1 | % { @($_).count } | select -fir 1
1
PS (120) > gc test.txt -ReadCount 2 | % { @($_).count } | select -fir 1
2
PS (121) > gc test.txt -ReadCount 3 | % { @($_).count } | select -fir 1
3
PS (122) > gc test.txt -ReadCount 4 | % { @($_).count } | select -fir 1
4

In each case where -ReadCount is greater than 1, the variable $_ is set to a collection of objects where the object count of that collection is equivalent to the value specified by -ReadCount. In another example, you’ll use Where-Object to filter the pipeline:

PS (127) > gc test.txt -read 5 | ? {$_ -like '*'} | % { $_.count }
5
5

You can see that the filter result contains two collections of 5 objects each written to the pipeline for a total of 10 objects. Now use ForEach-Object and the if statement to filter the list:

PS (128) > (gc test.txt -read 10 | % {if ($_ -match '.') {$_}} |
>>> Measure-Object).count
>>>
10

This time you see a count of 10 because the value of $_ in the ForEach-Object cmdlet is unraveled when written to the output pipe. And now let’s look at one final example using Where-Object:

PS (130) > (gc test.txt -read 4 | %{$_} | where {$_ -like '*a*'} |
>>> Measure-Object).count
>>>
10

Here you’ve inserted one more ForEach-Object command between the gc and the Where-Object, which simply unravels the collections in $_ and so you again see a count of 10.

Note

Here’s the annoying thing: from the Get-Content developer’s perspective, it actually is doing a read of -ReadCount objects from the provider. Get-Content reads -ReadCount objects and then writes them as a single object to the pipeline instead of unraveling them. (I suspect that this is a bug that’s turned into a feature.) Anyway, the name makes perfect sense to the developer and absolutely no sense to the user. This is why developers always have to be aware of the user’s perspective even if it doesn’t precisely match the implementation details.

In summary, whenever -ReadCount is set to a value greater than 1, usually for performance reasons, object collections are sent through the pipeline to Where-Object instead of individual objects. As a result, you have to take extra steps to deal with unraveling the batched collections of objects.
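
The batching behavior described above can be reproduced without an existing test.txt. The following is a minimal sketch, not part of the original examples: it assumes a throwaway temp file with ten generated lines ("line 1" through "line 10") and an arbitrary -ReadCount of 4. The variable names $path, $batchSizes, and $matched are made up for illustration.

```powershell
# Build a throwaway 10-line file so the sketch is self-contained.
$path = [System.IO.Path]::GetTempFileName()
1..10 | ForEach-Object { "line $_" } | Set-Content -Path $path

# Each object arriving from Get-Content is now a batch (an array of lines),
# so counting the items in $_ reports the batch size, not a line length.
$batchSizes = Get-Content -Path $path -ReadCount 4 |
    ForEach-Object { @($_).Count }

# Unravel the batches first, then filter one line at a time.
$matched = @(Get-Content -Path $path -ReadCount 4 |
    ForEach-Object { $_ } |
    Where-Object { $_ -like '*1*' })

"Batch sizes: $batchSizes"
"Matching lines: $($matched.Count)"

# Clean up the temp file.
Remove-Item -Path $path
```

With ten lines and a read count of 4, the batches should come out as 4, 4, and 2 lines, and only "line 1" and "line 10" should survive the filter. If you drop the intermediate ForEach-Object, Where-Object tests each batch as a whole and passes or rejects the entire collection, which is the behavior that surprises people.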

At this point we’ve covered the two main flow-control cmdlets in detail. We’ve discussed how they work, how they can be used, and some of the benefits (and pitfalls) you’ll encounter when using them. An important point to note is that there’s nothing special about these cmdlets—they can be implemented by anyone and require no special access to the inner workings of the PowerShell engine. This is a characteristic we’ll explore in later chapters where you’ll see how you can take advantage of it. In the meantime, let’s look at one final feature of the PowerShell language: the ability to use all these statements we’ve been talking about as expressions that return values. Although not unique to PowerShell, this feature may seem a bit unusual to people who are used to working with languages like VBScript or C#. Let’s take a look.

6.6. Statements as Values

Let’s return to something we discussed a bit earlier when we introduced subexpressions in chapter 5—namely, the difference between statements and expressions. In general, statements don’t return values, but if they’re used as part of a subexpression (or a function or script as you’ll see later on), they do return a result. This is best illustrated with an example. Assume that you didn’t have the range operator and wanted to generate an array of numbers from 1 to 10. Here’s the traditional approach you might use in a language such as C#:

PS (1) > $result = new-object System.Collections.ArrayList
PS (2) > for ($i=1; $i -le 10; $i++) { [void] $result.Add($i) }
PS (3) > "$($result.ToArray())"
1 2 3 4 5 6 7 8 9 10

First you create an instance of System.Collections.ArrayList to hold the result. Then you use a for loop to step through the numbers, adding each number to the result ArrayList. Finally you convert the ArrayList to an array and display the result. This is a straightforward approach to creating the array, but requires several steps. Using loops in subexpressions, you can simplify it quite a bit. Here’s the rewritten example:

PS (4) > $result = $(for ($i=1; $i -le 10; $i++) {$i})
PS (5) > "$result"
1 2 3 4 5 6 7 8 9 10

Here you don’t have to initialize the result or do explicit adds to the result collection. The output of the loop is captured and automatically saved as a collection by the interpreter. In fact, this is more efficient than the previous example, because the interpreter can optimize the management of the collection internally. This approach applies to any kind of statement. Let’s look at an example where you want to conditionally assign a value to a variable if it doesn’t currently have a value. First verify that the variable has no value:

PS (1) > $var

Now do the conditional assignment. This uses an if statement in a subexpression:

PS (2) > $var = $(if (! $var) { 12 } else {$var})
PS (3) > $var
12

From the output, you can see that the variable has been set. Change the variable, and rerun the conditional assignment:

PS (4) > $var="Hello there"
PS (5) > $var = $(if (! $var) { 12 } else {$var})
PS (6) > $var
Hello there

This time the variable isn’t changed.

In PowerShell version 2, assigning the output of a flow-control statement has been simplified so that you can assign the output directly to a variable. Although this doesn’t add any new capabilities, it does make things simpler and cleaner. For instance, the previous example can be simplified to

PS (7) > $var = if (! $var) { 12 } else {$var}

using this feature. And the for example you saw earlier can be simplified to

PS (4) > $result = for ($i=1; $i -le 10; $i++) {$i}

making it (somewhat) easier to read.
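
The same idea carries over to the other statements covered in this chapter. The following is a small sketch rather than an exhaustive list: the variable names are made up for illustration, and the direct-assignment forms assume PowerShell version 2 or later. It also shows one detail worth knowing: a statement that emits exactly one value gives you a scalar, so wrap the statement in @( ... ) when you always want an array.

```powershell
# A while loop in a subexpression: every value the body emits is collected.
$i = 0
$evens = $(while ($i -lt 10) { $i += 2; $i })

# A switch statement assigned directly (PowerShell v2 and later).
$kind = switch (7) {
    { $_ % 2 } { 'odd' }
    default    { 'even' }
}

# A statement that emits exactly one value yields a scalar, not an array.
# Wrapping it in @( ... ) guarantees an array regardless of the count.
$scalar = foreach ($n in 1..1) { $n * 10 }
$array  = @(foreach ($n in 1..1) { $n * 10 })

"$evens"                 # 2 4 6 8 10
$kind                    # odd
$scalar -is [array]      # False
$array -is [array]       # True
```

The assignment inside the while body ($i += 2) produces no output of its own, so only the bare $i that follows it ends up in the collected result.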

Used judiciously, the fact that statements can be used as value expressions can simplify your code in many circumstances. By eliminating temporary variables and extra initializations, creating collections is greatly simplified, as you saw with the for loop. On the other hand, it’s entirely possible to use this statement-as-expression capability to produce scripts that are hard to read. (Remember the nested if statement example we looked at earlier in this chapter?) You should always keep that in mind when using these features in scripts. The other thing to keep in mind when you use statements is the performance of your scripts. Let’s dig into this in a bit more detail.

6.7. A Word About Performance

Now that we’ve covered loops in PowerShell, this is a good time to talk about performance. PowerShell is an interpreted language, which has performance implications. Tasks with a lot of small repetitive actions can take a long time to execute. Anything with a loop statement can be a performance hotspot for this reason. Identifying these hotspots and rewriting them can have a huge impact on script performance. Let’s take a look at a real example. I was writing a script to process a collection of events, extracting events having a specific name and ID and placing them into a new collection. The script looked something like this:

$results = @()
for ($i=0; $i -lt $Events.Length ; $i++)
{
   $name = [string] $Events[$i].ProviderName
   $id = [long] $Events[$i].Id

   if ($name -ne "My-Provider-Name")
   {
      continue
   }

   if ($id -ne 3005)
   {
      continue
   }

   $results += $Events[$i]
}

This script indexed through the collection of events using the for statement, and then used the continue statement to skip to the next event if the current event didn’t match the desired criteria. If the event did match the criteria, it was appended to the result collection. Although this worked correctly, for large collections of events it was taking several minutes to execute. Let’s look at some ways to speed it up and make it smaller.

First, consider how you’re indexing through the collection. This requires a lot of index operations, variable retrievals and increments that aren’t the most efficient operations in an interpreted language like PowerShell. Instead, PowerShell has a number of constructs that let you iterate through a collection automatically. Given that the task is to select events where some condition is true, the Where-Object cmdlet is an obvious choice. The second optimization is how the result list is built. The original code manually adds each element to the result array. If you remember our discussion on how array catenation works, this means that the array has to be copied each time an element is added. The alternative approach, as we discussed, is to simply let the pipeline do the collection for you. With these design changes, the new script looks like

$BranchCache3005Events = $events | where {
    $_.Id -eq 3005 -and $_.ProviderName -eq "My-Provider-Name"}

The revised script is both hundreds of times faster and significantly shorter and clearer.

So, the rule for writing efficient PowerShell scripts is to let the system do the work for you. Use foreach instead of explicit indexing with for if you can. If you ever find yourself doing catenation in a loop to build up a string or collection, look at using the pipeline instead. You can also take advantage of the fact that PowerShell statements can be used as values, so an even faster (but less obvious or simple) way to do this is to use the foreach statement:

$BranchCache3005Events = @( foreach ($e in $events) {
   if ($e.Id -eq 3005 -and
      $e.ProviderName -eq "My-Provider-Name") {$e}} )

The key here is still letting the system implicitly build the result array instead of constructing it manually with +=. Likewise for string catenation, this

$s = -join $( foreach ($i in 1..40kb) { "a" } )

is faster than

$s = ""; foreach ($i in 1..40kb) { $s += "a" }

By following these guidelines, not only will your scripts be faster, they’ll also end up being shorter and frequently simpler and clearer (though not always).
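
If you want to check these claims on your own system, Measure-Command gives a rough timing for each approach. The following is only a sketch under stated assumptions: the event data is synthetic (5,000 made-up objects with Id and ProviderName properties), the names $viaLoop, $viaWhere, $viaForeach, and $counts are hypothetical, and the timings you see will vary with the PowerShell version, the machine, and the size of the collection.

```powershell
# Hypothetical test data: 5,000 objects shaped like the events in the text.
$events = foreach ($n in 1..5000) {
    $provider = if ($n % 3) { 'My-Provider-Name' } else { 'Other-Provider' }
    New-Object PSObject -Property @{ Id = 3000 + ($n % 10); ProviderName = $provider }
}

# The three approaches from this section, each wrapped in a script block.
$viaLoop = {
    $r = @()
    for ($i = 0; $i -lt $events.Length; $i++) {
        if ($events[$i].Id -eq 3005 -and
            $events[$i].ProviderName -eq 'My-Provider-Name') { $r += $events[$i] }
    }
    $r
}
$viaWhere = {
    $events | Where-Object {
        $_.Id -eq 3005 -and $_.ProviderName -eq 'My-Provider-Name' }
}
$viaForeach = {
    foreach ($e in $events) {
        if ($e.Id -eq 3005 -and $e.ProviderName -eq 'My-Provider-Name') { $e }
    }
}

# Run each approach once for its result and once under Measure-Command.
$counts = @{}
foreach ($name in 'viaLoop', 'viaWhere', 'viaForeach') {
    $block = Get-Variable -Name $name -ValueOnly
    $counts[$name] = @(& $block).Count
    $ms = (Measure-Command { & $block }).TotalMilliseconds
    '{0,-12} {1,5} matches {2,10:N1} ms' -f $name, $counts[$name], $ms
}
```

All three approaches should report the same number of matches; only the elapsed time differs. Treat a single run as an indication rather than a benchmark, and repeat the measurement with a collection size close to your real workload before deciding whether a rewrite is worth it.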

6.8. Summary

In this chapter, we covered the branching and looping statements in the PowerShell language, as summarized in the following list:

  • PowerShell allows you to use pipelines where other languages only allow expressions. This means that, although the PowerShell flow-control statements appear to be similar to the corresponding statements in other languages, enough differences exist to make it useful for you to spend time experimenting with them.
  • There are two ways of handling flow control in PowerShell. The first is to use the language flow-control statements such as while and foreach. But when performing pipelined operations, the alternative mechanism—the flow-control cmdlets ForEach-Object and Where-Object—can be more natural and efficient.
  • When iterating over collections, you should keep in mind the trade-offs between the foreach statement and the ForEach-Object cmdlet.
  • Any statement can be used as a value expression when nested in a subexpression. For example, you could use a while loop in a subexpression to generate a collection of values. In PowerShell v2, for simple assignments, the subexpression notation is no longer needed and the output of a statement can be assigned directly to a variable. This mechanism can be a concise way of generating a collection, but keep in mind the potential complexity that this kind of nested statement can introduce.
  • The PowerShell switch statement is a powerful tool. On the surface it looks like the switch statement in C# or the select statement in Visual Basic, but with powerful pattern matching capabilities, it goes well beyond what the statements in the other languages can do. And, along with the pattern matching, it can be used as a looping construct for selecting and processing objects from a collection or lines read from a file. In fact, much of its behavior was adapted from the AWK programming language.
  • The choice of statements and how you use them can have a significant effect on the performance of your scripts. This is something to keep in mind, but remember, only worry about performance if it becomes a problem. Otherwise, try to focus on making things as clear as possible.