Starlark is a dialect of Python intended for use as a configuration
language. A Starlark interpreter is typically embedded within a larger
application, and this application may define additional
domain-specific functions and data types beyond those provided by the
core language. For example, Starlark is embedded within (and was
originally developed for) the [Bazel build tool](https://bazel.build),
and [Bazel's build language](https://docs.bazel.build/versions/master/starlark/language.html) is based on Starlark.
This document describes the Go implementation of Starlark.
The language it defines is similar but not identical to
[the Java-based implementation](https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/starlark/Starlark.java)
used by Bazel.
We identify places where their behaviors differ, and an
[appendix](#dialect-differences) provides a summary of those
differences.
We plan to converge both implementations on a single specification.
This document is maintained by Alan Donovan <adonovan@google.com>.
It was influenced by the Python specification,
Copyright 1990–2017, Python Software Foundation,
and the Go specification, Copyright 2009–2017, The Go Authors.
Starlark was designed and implemented in Java by Laurent Le Brun,
Dmitry Lomov, Jon Brandvin, and Damien Martin-Guillerez, standing on
the shoulders of the Python community.
The Go implementation was written by Alan Donovan and Jay Conrod;
its scanner was derived from one written by Russ Cox.
## Overview
Starlark is an untyped dynamic language with high-level data types,
first-class functions with lexical scope, and automatic memory
management or _garbage collection_.
Starlark is strongly influenced by Python, and is almost a subset of
that language. In particular, its data types and syntax for
statements and expressions will be very familiar to any Python
programmer.
However, Starlark is intended not for writing applications but for
expressing configuration: its programs are short-lived and have no
external side effects and their main result is structured data or side
effects on the host application.
As a result, Starlark has no need for classes, exceptions, reflection,
concurrency, and other such features of Python.
Starlark execution is _deterministic_: all functions and operators
in the core language produce the same execution each time the program
is run; there are no sources of random numbers, clocks, or unspecified
iterators. This makes Starlark suitable for use in applications where
reproducibility is paramount, such as build tools.
## Contents
<!-- WTF? No automatic TOC? -->
* [Overview](#overview)
* [Contents](#contents)
* [Lexical elements](#lexical-elements)
* [Data types](#data-types)
* [None](#none)
* [Booleans](#booleans)
* [Integers](#integers)
* [Floating-point numbers](#floating-point-numbers)
* [Strings](#strings)
* [Lists](#lists)
* [Tuples](#tuples)
* [Dictionaries](#dictionaries)
* [Sets](#sets)
* [Functions](#functions)
* [Built-in functions](#built-in-functions)
* [Name binding and variables](#name-binding-and-variables)
* [Value concepts](#value-concepts)
* [Identity and mutation](#identity-and-mutation)
* [Freezing a value](#freezing-a-value)
* [Hashing](#hashing)
* [Sequence types](#sequence-types)
* [Indexing](#indexing)
* [Expressions](#expressions)
* [Identifiers](#identifiers)
* [Literals](#literals)
* [Parenthesized expressions](#parenthesized-expressions)
* [Dictionary expressions](#dictionary-expressions)
* [List expressions](#list-expressions)
* [Unary operators](#unary-operators)
* [Binary operators](#binary-operators)
* [Conditional expressions](#conditional-expressions)
* [Comprehensions](#comprehensions)
* [Function and method calls](#function-and-method-calls)
* [Dot expressions](#dot-expressions)
* [Index expressions](#index-expressions)
* [Slice expressions](#slice-expressions)
* [Lambda expressions](#lambda-expressions)
* [Statements](#statements)
* [Pass statements](#pass-statements)
* [Assignments](#assignments)
* [Augmented assignments](#augmented-assignments)
* [Function definitions](#function-definitions)
* [Return statements](#return-statements)
* [Expression statements](#expression-statements)
* [If statements](#if-statements)
* [For loops](#for-loops)
* [Break and Continue](#break-and-continue)
* [Load statements](#load-statements)
* [Module execution](#module-execution)
* [Built-in constants and functions](#built-in-constants-and-functions)
* [None](#none)
* [True and False](#true-and-false)
* [any](#any)
* [all](#all)
* [bool](#bool)
* [chr](#chr)
* [dict](#dict)
* [dir](#dir)
* [enumerate](#enumerate)
* [float](#float)
* [getattr](#getattr)
* [hasattr](#hasattr)
* [hash](#hash)
* [int](#int)
* [len](#len)
* [list](#list)
* [max](#max)
* [min](#min)
* [ord](#ord)
* [print](#print)
* [range](#range)
* [repr](#repr)
* [reversed](#reversed)
* [set](#set)
* [sorted](#sorted)
* [str](#str)
* [tuple](#tuple)
* [type](#type)
* [zip](#zip)
* [Built-in methods](#built-in-methods)
* [dict·clear](#dict·clear)
* [dict·get](#dict·get)
* [dict·items](#dict·items)
* [dict·keys](#dict·keys)
* [dict·pop](#dict·pop)
* [dict·popitem](#dict·popitem)
* [dict·setdefault](#dict·setdefault)
* [dict·update](#dict·update)
* [dict·values](#dict·values)
* [list·append](#list·append)
* [list·clear](#list·clear)
* [list·extend](#list·extend)
* [list·index](#list·index)
* [list·insert](#list·insert)
* [list·pop](#list·pop)
* [list·remove](#list·remove)
* [set·union](#set·union)
* [string·capitalize](#string·capitalize)
* [string·codepoint_ords](#string·codepoint_ords)
* [string·codepoints](#string·codepoints)
* [string·count](#string·count)
* [string·elem_ords](#string·elem_ords)
* [string·elems](#string·elems)
* [string·endswith](#string·endswith)
* [string·find](#string·find)
* [string·format](#string·format)
* [string·index](#string·index)
* [string·isalnum](#string·isalnum)
* [string·isalpha](#string·isalpha)
* [string·isdigit](#string·isdigit)
* [string·islower](#string·islower)
* [string·isspace](#string·isspace)
* [string·istitle](#string·istitle)
* [string·isupper](#string·isupper)
* [string·join](#string·join)
* [string·lower](#string·lower)
* [string·lstrip](#string·lstrip)
* [string·partition](#string·partition)
* [string·replace](#string·replace)
* [string·rfind](#string·rfind)
* [string·rindex](#string·rindex)
* [string·rpartition](#string·rpartition)
* [string·rsplit](#string·rsplit)
* [string·rstrip](#string·rstrip)
* [string·split](#string·split)
* [string·splitlines](#string·splitlines)
* [string·startswith](#string·startswith)
* [string·strip](#string·strip)
* [string·title](#string·title)
* [string·upper](#string·upper)
* [Dialect differences](#dialect-differences)
## Lexical elements
A Starlark program consists of one or more modules.
Each module is defined by a single UTF-8-encoded text file.
A complete grammar of Starlark can be found in [grammar.txt](../syntax/grammar.txt).
That grammar is presented piecemeal throughout this document
in boxes such as this one, which explains the notation:
```grammar {.good}
Grammar notation
- lowercase and 'quoted' items are lexical tokens.
- Capitalized names denote grammar productions.
- (...) implies grouping.
- x | y means either x or y.
- [x] means x is optional.
- {x} means x is repeated zero or more times.
- The end of each declaration is marked with a period.
```
The contents of a Starlark file are broken into a sequence of tokens of
five kinds: white space, punctuation, keywords, identifiers, and literals.
Each token is formed from the longest sequence of characters that
would form a valid token of each kind.
```grammar {.good}
File = {Statement | newline} eof .
```
*White space* consists of spaces (U+0020), tabs (U+0009), carriage
returns (U+000D), and newlines (U+000A). Within a line, white space
has no effect other than to delimit the previous token, but newlines,
and spaces at the start of a line, are significant tokens.
*Comments*: A hash character (`#`) appearing outside of a string
literal marks the start of a comment; the comment extends to the end
of the line, not including the newline character.
Comments are treated like other white space.
*Punctuation*: The following punctuation characters or sequences of
characters are tokens:
```text
+ - * / // % =
+= -= *= /= //= %= == !=
^ < > << >> & |
^= <= >= <<= >>= &= |=
. , ; : ~ **
( ) [ ] { }
```
*Keywords*: The following tokens are keywords and may not be used as
identifiers:
```text
and elif in or
break else lambda pass
continue for load return
def if not while
```
The tokens below also may not be used as identifiers although they do not
appear in the grammar; they are reserved as possible future keywords:
<!-- and to remain a syntactic subset of Python -->
```text
as finally nonlocal
assert from raise
class global try
del import with
except is yield
```
<b>Implementation note:</b>
The Go implementation permits `assert` to be used as an identifier,
and this feature is widely used in its tests.
*Identifiers*: an identifier is a sequence of Unicode letters, decimal
digits, and underscores (`_`), not starting with a digit.
Identifiers are used as names for values.
Examples:
```text
None True len
x index starts_with arg0
```
*Literals*: literals are tokens that denote specific values. Starlark
has string, integer, and floating-point literals.
```text
0 # int
123 # decimal int
0x7f # hexadecimal int
0b1011 # binary int
0.0 0. .0 # float
1e10 1e+10 1e-10
1.1e10 1.1e+10 1.1e-10
"hello" 'hello' # string
'''hello''' """hello""" # triple-quoted string
r'hello' r"hello" # raw string literal
```
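As a quick check of the literal forms listed above, the sketch below compares several spellings of the same value. It is written in the subset shared by Starlark and Python 3, and the variable names are illustrative only.

```python
# Different spellings of the same integer value (decimal, hex, binary).
same_int = [127, 0x7f, 0b1111111]

# Float literals with and without an explicit exponent sign.
same_float = [10000000000.0, 1e10, 1e+10]

# Single, double, and triple quotation marks all delimit the same string.
same_string = ["hello", 'hello', '''hello''', """hello"""]

ints_agree = all([v == 127 for v in same_int])
floats_agree = all([v == 1e10 for v in same_float])
strings_agree = all([v == "hello" for v in same_string])
```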
Integer and floating-point literal tokens are defined by the following grammar:
```grammar {.good}
int = decimal_lit | octal_lit | hex_lit | binary_lit .
decimal_lit = ('1' … '9') {decimal_digit} | '0' .
octal_lit = '0' ('o'|'O') octal_digit {octal_digit} .
hex_lit = '0' ('x'|'X') hex_digit {hex_digit} .
binary_lit = '0' ('b'|'B') binary_digit {binary_digit} .
float = decimals '.' [decimals] [exponent]
| decimals exponent
| '.' decimals [exponent]
.
decimals = decimal_digit {decimal_digit} .
exponent = ('e'|'E') ['+'|'-'] decimals .
decimal_digit = '0' … '9' .
octal_digit = '0' … '7' .
hex_digit = '0' … '9' | 'A' … 'F' | 'a' … 'f' .
binary_digit  = '0' | '1' .
```
### String literals
A Starlark string literal denotes a string value.
In its simplest form, it consists of the desired text
surrounded by matching single- or double-quotation marks:
```python
"abc"
'abc'
```
Literal occurrences of the chosen quotation mark character must be
escaped by a preceding backslash. So, if a string contains several
of one kind of quotation mark, it may be convenient to quote the string
using the other kind, as in these examples:
```python
'Have you read "To Kill a Mockingbird?"'
"Yes, it's a classic."
"Have you read \"To Kill a Mockingbird?\""
'Yes, it\'s a classic.'
```
Literal occurrences of the _opposite_ kind of quotation mark, such as
an apostrophe within a double-quoted string literal, may be escaped
by a backslash, but this is not necessary: `"it's"` and `"it\'s"` are
equivalent.
#### String escapes
Within a string literal, the backslash character `\` indicates the
start of an _escape sequence_, a notation for expressing things that
are impossible or awkward to write directly.
The following *traditional escape sequences* represent the ASCII control
codes 7-13:
```
\a \x07 alert or bell
\b \x08 backspace
\f \x0C form feed
\n \x0A line feed
\r \x0D carriage return
\t \x09 horizontal tab
\v \x0B vertical tab
```
A *literal backslash* is written using the escape `\\`.
An *escaped newline*---that is, a backslash at the end of a line---is ignored,
allowing a long string to be split across multiple lines of the source file.
```python
"abc\
def" # "abcdef"
```
An *octal escape* encodes a single byte using its octal value.
It consists of a backslash followed by one, two, or three octal digits [0-7].
It is an error if the value is greater than decimal 255.
```python
'\0' # "\x00" a string containing a single NUL byte
'\12' # "\n" octal 12 = decimal 10
'\101-\132' # "A-Z"
'\119' # "\t9" = "\11" + "9"
```
<b>Implementation note:</b>
The Java implementation encodes strings using UTF-16,
so an octal escape encodes a single UTF-16 code unit.
Octal escapes for values above 127 are therefore not portable across implementations.
There is little reason to use octal escapes in new code.
A *hex escape* encodes a single byte using its hexadecimal value.
It consists of `\x` followed by exactly two hexadecimal digits [0-9A-Fa-f].
```python
"\x00" # "\x00" a string containing a single NUL byte
"(\x20)" # "( )" ASCII 0x20 = 32 = space
red, reset = "\x1b[31m", "\x1b[0m" # ANSI terminal control codes for color
"(" + red + "hello" + reset + ")" # "(hello)" with red text, if on a terminal
```
<b>Implementation note:</b>
The Java implementation does not support hex escapes.
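The octal and hex escapes described above can be compared side by side. This is a minimal sketch restricted to ASCII values, where the Go implementation and Python 3 should agree; the variable names are illustrative.

```python
# Each comparison pairs two spellings of the same string.
octal_a = "\101" == "A"        # octal 101 = decimal 65
hex_a = "\x41" == "A"          # hexadecimal 41 = decimal 65
newline = "\12" == "\n"        # octal 12 = decimal 10
tab_nine = "\119" == "\t9"     # '9' is not an octal digit, so the escape is "\11"
backslash = len("\\") == 1     # an escaped backslash is a single element
```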
An ordinary string literal may not contain an unescaped newline,
but a *multiline string literal* may spread over multiple source lines.
It is denoted using three quotation marks at start and end.
Within it, unescaped newlines and quotation marks (or even pairs of
quotation marks) have their literal meaning, but three quotation marks
end the literal. This makes it easy to quote large blocks of text with
few escapes.
```
haiku = '''
Yesterday it worked.
Today it is not working.
That's computers. Sigh.
'''
```
Regardless of the platform's convention for text line endings---for
example, a linefeed (\n) on UNIX, or a carriage return followed by a
linefeed (\r\n) on Microsoft Windows---an unescaped line ending in a
multiline string literal always denotes a line feed (\n).
Starlark also supports *raw string literals*, which look like an
ordinary single- or double-quotation preceded by `r`. Within a raw
string literal, there is no special processing of backslash escapes,
other than an escaped quotation mark (which denotes a literal
quotation mark), or an escaped newline (which denotes a backslash
followed by a newline). This form of quotation is typically used when
writing strings that contain many quotation marks or backslashes (such
as regular expressions or shell commands) to reduce the burden of
escaping:
```python
"a\nb" # "a\nb" = 'a' + '\n' + 'b'
r"a\nb" # "a\\nb" = 'a' + '\\' + 'n' + 'b'
"a\
b" # "ab"
r"a\
b" # "a\\\nb"
```
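To make the difference between ordinary and raw literals concrete, the following sketch compares their lengths and contents. It uses only ASCII text, so byte counts (Starlark) and character counts (Python 3) coincide; the names are illustrative.

```python
ordinary = "a\nb"     # 'a', line feed, 'b'
raw = r"a\nb"         # 'a', backslash, 'n', 'b'

lengths = (len(ordinary), len(raw))

# A raw literal equals the ordinary literal with every backslash doubled.
raw_equals_escaped = raw == "a\\nb"

# Typical use: a regular-expression pattern written without doubled backslashes.
pattern_matches = r"\d+\.\d+" == "\\d+\\.\\d+"
```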
It is an error for a backslash to appear within a string literal other
than as part of one of the escapes described above.
TODO: define indent, outdent, semicolon, newline, eof
## Data types
These are the main data types built in to the interpreter:

    NoneType                     # the type of None
    bool                         # True or False
    int                          # a signed integer of arbitrary magnitude
    float                        # an IEEE 754 double-precision floating point number
    string                       # a byte string
    list                         # a modifiable sequence of values
    tuple                        # an unmodifiable sequence of values
    dict                         # a mapping from values to values
    set                          # a set of values
    builtin_function_or_method   # a function or method implemented by the interpreter or host application

Some functions, such as the iteration methods of `string`, or the
`range` function, return instances of special-purpose types that don't
appear in this list.
Additional data types may be defined by the host application into
which the interpreter is embedded, and those data types may
participate in basic operations of the language such as arithmetic,
comparison, indexing, and function calls.
<!-- We needn't mention the stringIterable type here. -->
Some operations can be applied to any Starlark value. For example,
every value has a type string that can be obtained with the expression
`type(x)`, and any value may be converted to a string using the
expression `str(x)`, or to a Boolean truth value using the expression
`bool(x)`. Other operations apply only to certain types. For
example, the indexing operation `a[i]` works only with strings, lists,
and tuples, and any application-defined types that are _indexable_.
The [_value concepts_](#value-concepts) section explains the groupings of
types by the operators they support.
### None
`None` is a distinguished value used to indicate the absence of any other value.
For example, the result of a call to a function that contains no return statement is `None`.
`None` is equal only to itself. Its [type](#type) is `"NoneType"`.
The truth value of `None` is `False`.
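A small sketch of these properties, using a hypothetical function `no_return` that has no return statement:

```python
def no_return():
    pass

result = no_return()

# None is equal only to itself, and its truth value is False.
is_none = result == None
equals_zero = result == 0
truth = bool(result)
```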
### Booleans
There are two Boolean values, `True` and `False`, representing the
truth or falsehood of a predicate. The [type](#type) of a Boolean is `"bool"`.
Boolean values are typically used as conditions in `if`-statements,
although any Starlark value used as a condition is implicitly
interpreted as a Boolean.
For example, the values `None`, `0`, `0.0`, and the empty sequences
`""`, `()`, `[]`, and `{}` have a truth value of `False`, whereas non-zero
numbers and non-empty sequences have a truth value of `True`.
Application-defined types determine their own truth value.
Any value may be explicitly converted to a Boolean using the built-in `bool`
function.
```python
1 + 1 == 2 # True
2 + 2 == 5 # False
if 1 + 1:
    print("True")
else:
    print("False")
```
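The truth values listed above can be checked directly with `bool`. This sketch covers only built-in types; the list names are illustrative.

```python
falsy = [None, 0, 0.0, "", (), [], {}]
truthy = [1, -1, 0.5, "a", (0,), [0], {"k": 0}]

# Note that a non-empty sequence is true even if its elements are false.
falsy_results = [bool(v) for v in falsy]
truthy_results = [bool(v) for v in truthy]
```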
### Integers
The Starlark integer type represents integers. Its [type](#type) is `"int"`.
Integers may be positive or negative, and arbitrarily large.
Integer arithmetic is exact.
Integers are totally ordered; comparisons follow mathematical
tradition.
The `+` and `-` operators perform addition and subtraction, respectively.
The `*` operator performs multiplication.
The `//` and `%` operations on integers compute floored division and
remainder of floored division, respectively.
If the signs of the operands differ, the sign of the remainder `x % y`
matches that of the divisor, `y`.
For all finite x and y (y ≠ 0), `(x // y) * y + (x % y) == x`.
The `/` operator implements real division, and
yields a `float` result even when its operands are both of type `int`.
Integers, including negative values, may be interpreted as bit vectors.
The `|`, `&`, and `^` operators implement bitwise OR, AND, and XOR,
respectively. The unary `~` operator yields the bitwise inversion of its
integer argument. The `<<` and `>>` operators shift the first argument
to the left or right by the number of bits given by the second argument.
Any bool, number, or string may be interpreted as an integer by using
the `int` built-in function.
An integer used in a Boolean context is considered true if it is
non-zero.
```python
100 // 5 * 9 + 32 # 212
3 // 2 # 1
3 / 2 # 1.5
111111111 * 111111111 # 12345678987654321
"0x%x" % (0x1234 & 0xf00f) # "0x1004"
int("ffff", 16) # 65535, 0xffff
```
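The sign rules for floored division are easy to get wrong, so here is a short sketch of the four sign combinations together with the identity stated above and the bitwise operators. The helper `check_identity` is a hypothetical name introduced for illustration.

```python
def check_identity(x, y):
    # Holds for all integers x and y with y != 0.
    return (x // y) * y + (x % y) == x

quotients = [7 // 2, -7 // 2, 7 // -2, -7 // -2]    # [3, -4, -4, 3]
remainders = [7 % 2, -7 % 2, 7 % -2, -7 % -2]       # [1, 1, -1, -1]

identity_holds = all([check_identity(x, y) for x in [7, -7] for y in [2, -2]])

# Bitwise OR, AND, XOR, inversion, and shifts on integers viewed as bit vectors.
bits = [0b1100 | 0b1010, 0b1100 & 0b1010, 0b1100 ^ 0b1010, ~0, 1 << 4, -16 >> 2]
```

The remainder always takes the sign of the divisor, which is why `-7 % 2` is `1` rather than `-1`.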
### Floating-point numbers
The Starlark floating-point data type represents an IEEE 754
double-precision floating-point number. Its [type](#type) is `"float"`.
Arithmetic on floats using the `+`, `-`, `*`, `/`, `//`, and `%`
operators follows the IEEE 754 standard.
However, computing the division or remainder of division by zero is a dynamic error.
An arithmetic operation applied to a mixture of `float` and `int`
operands works as if the `int` operand is first converted to a
`float`. For example, `3.141 + 1` is equivalent to `3.141 +
float(1)`.
There are two floating-point division operators:
`x / y` yields the floating-point quotient of `x` and `y`,
whereas `x // y` yields `floor(x / y)`, that is, the largest
integer value not greater than `x / y`.
Although the resulting number is integral, it is represented as a
`float` if either operand is a `float`.
The `%` operation computes the remainder of floored division.
As with the corresponding operation on integers,
if the signs of the operands differ, the sign of the remainder `x % y`
matches that of the divisor, `y`.
The infinite float values `+Inf` and `-Inf` represent numbers
greater/less than all finite float values.
The non-finite `NaN` value represents the result of dubious operations
such as `Inf/Inf`. A NaN value compares neither less than, nor
greater than, nor equal to any value, including itself.
All floats other than NaN are totally ordered, so they may be compared
using operators such as `==` and `<`.
Any bool, number, or string may be interpreted as a floating-point
number by using the `float` built-in function.
A float used in a Boolean context is considered true if it is
non-zero.
```python
1.23e45 * 1.23e45 # 1.5129e+90
1.111111111111111 * 1.111111111111111 # 1.23457
3.0 / 2 # 1.5
3 / 2.0 # 1.5
float(3) / 2 # 1.5
3.0 // 2.0 # 1
```
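The sketch below exercises mixed `int`/`float` arithmetic, floored division and remainder on floats, and NaN comparisons. It assumes that `float("nan")` yields a NaN value, as it does in Python 3; the variable names are illustrative.

```python
# An int operand is converted to float before the operation.
mixed = 3.141 + 1 == 3.141 + float(1)

# Floored division yields an integral value represented as a float.
floor_div = 7.5 // 2          # 3.0
remainder = -7.5 % 2          # 0.5, the sign matches the divisor

# NaN compares neither equal to, less than, nor greater than itself.
nan = float("nan")
nan_compare = [nan == nan, nan < nan, nan > nan]
```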
### Strings
A string represents an immutable sequence of bytes.
The [type](#type) of a string is `"string"`.
Strings can represent arbitrary binary data, including zero bytes, but
most strings contain text, encoded by convention using UTF-8.
The built-in `len` function returns the number of bytes in a string.
Strings may be concatenated with the `+` operator.
The substring expression `s[i:j]` returns the substring of `s` from
index `i` up to index `j`. The index expression `s[i]` returns the
1-byte substring `s[i:i+1]`.
Strings are hashable, and thus may be used as keys in a dictionary.
Strings are totally ordered lexicographically, so strings may be
compared using operators such as `==` and `<`.
Strings are _not_ iterable sequences, so they cannot be used as the operand of
a `for`-loop, list comprehension, or any other operation that requires
an iterable sequence.
To obtain a view of a string as an iterable sequence of numeric byte
values, 1-byte substrings, numeric Unicode code points, or 1-code
point substrings, you must explicitly call one of its four methods:
`elems`, `elem_ords`, `codepoints`, or `codepoint_ords`.
Any value may be formatted as a string using the `str` or `repr` built-in
functions, the `str % tuple` operator, or the `str.format` method.
A string used in a Boolean context is considered true if it is
non-empty.
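A brief sketch of the string operations described above. It uses ASCII text only, so each character occupies exactly one byte and the results should be the same under Python 3; the names are illustrative.

```python
s = "hello"

middle = s[1:3]                 # "el"
one_byte = s[1] == s[1:2]       # indexing yields a 1-byte substring
ordered = "abc" < "abd"         # lexicographic comparison
table = {s: len(s)}             # strings are hashable, so usable as dict keys

formatted = [
    "%s has %d bytes" % (s, len(s)),
    "{} world".format(s),
    str(5),
]
```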
Strings have several built-in methods:
* [`capitalize`](#string·capitalize)
* [`codepoint_ords`](#string·codepoint_ords)
* [`codepoints`](#string·codepoints)
* [`count`](#string·count)
* [`elem_ords`](#string·elem_ords)
* [`elems`](#string·elems)
* [`endswith`](#string·endswith)
* [`find`](#string·find)
* [`format`](#string·format)
* [`index`](#string·index)
* [`isalnum`](#string·isalnum)
* [`isalpha`](#string·isalpha)
* [`isdigit`](#string·isdigit)
* [`islower`](#string·islower)
* [`isspace`](#string·isspace)
* [`istitle`](#string·istitle)
* [`isupper`](#string·isupper)
* [`join`](#string·join)
* [`lower`](#string·lower)
* [`lstrip`](#string·lstrip)
* [`partition`](#string·partition)
* [`replace`](#string·replace)
* [`rfind`](#string·rfind)
* [`rindex`](#string·rindex)
* [`rpartition`](#string·rpartition)
* [`rsplit`](#string·rsplit)
* [`rstrip`](#string·rstrip)
* [`split`](#string·split)
* [`splitlines`](#string·splitlines)
* [`startswith`](#string·startswith)
* [`strip`](#string·strip)
* [`title`](#string·title)
* [`upper`](#string·upper)
<b>Implementation note:</b>
The type of a string element varies across implementations.
There is agreement that byte strings, with text conventionally encoded
using UTF-8, are the ideal choice, but the Java implementation treats
strings as sequences of UTF-16 codes and changing it appears
intractable; see Google Issue b/36360490.
<b>Implementation note:</b>
The Java implementation does not consistently treat strings as
iterable; see `testdata/string.star` in the test suite and Google Issue
b/34385336 for further details.
### Lists
A list is a mutable sequence of values.
The [type](#type) of a list is `"list"`.
Lists are indexable sequences: the elements of a list may be iterated
over by `for`-loops, list comprehensions, and various built-in
functions.
Lists may be constructed using bracketed list notation:
```python
[] # an empty list
[1] # a 1-element list
[1, 2] # a 2-element list
```
Lists can also be constructed from any iterable sequence by using the
built-in `list` function.
The built-in `len` function applied to a list returns the number of elements.
The index expression `list[i]` returns the element at index i,
and the slice expression `list[i:j]` returns a new list consisting of
the elements at indices from i to j.
List elements may be added using the `append` or `extend` methods,
removed using the `remove` method, or reordered by assignments such as
`list[i] = list[j]`.
The concatenation operation `x + y` yields a new list containing all
the elements of the two lists x and y.
For most types, `x += y` is equivalent to `x = x + y`, except that it
evaluates `x` only once, that is, it allocates a new list to hold
the concatenation of `x` and `y`.
However, if `x` refers to a list, the statement does not allocate a
new list but instead mutates the original list in place, similar to
`x.extend(y)`.
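The difference between the two forms is observable through a second reference to the same list. The sketch below uses two hypothetical functions, `augmented` and `rebinding`, that differ only in how they extend `x`.

```python
def augmented():
    x = [1, 2]
    alias = x
    x += [3]         # mutates the list in place, so alias sees the change
    return alias

def rebinding():
    x = [1, 2]
    alias = x
    x = x + [3]      # allocates a new list; alias still refers to the old one
    return alias

in_place = augmented()     # [1, 2, 3]
copied = rebinding()       # [1, 2]
```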
Lists are not hashable, so may not be used in the keys of a dictionary.
A list used in a Boolean context is considered true if it is
non-empty.
A [_list comprehension_](#comprehensions) creates a new list whose elements are the
result of some expression applied to each element of another sequence.
```python
[x*x for x in [1, 2, 3, 4]] # [1, 4, 9, 16]
```
A list value has these methods:
* [`append`](#list·append)
* [`clear`](#list·clear)
* [`extend`](#list·extend)
* [`index`](#list·index)
* [`insert`](#list·insert)
* [`pop`](#list·pop)
* [`remove`](#list·remove)
### Tuples
A tuple is an immutable sequence of values.
The [type](#type) of a tuple is `"tuple"`.
Tuples are constructed using parenthesized list notation:
```python
() # the empty tuple
(1,) # a 1-tuple
(1, 2) # a 2-tuple ("pair")
(1, 2, 3) # a 3-tuple
```
Observe that for the 1-tuple, the trailing comma is necessary to
distinguish it from the parenthesized expression `(1)`.
1-tuples are seldom used.
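For instance, the following sketch contrasts a 1-tuple with a parenthesized integer; the variable names are illustrative.

```python
one_tuple = (1,)     # a tuple containing a single element
just_one = (1)       # merely the integer 1 in parentheses

checks = [
    len(one_tuple),             # 1
    just_one == 1,              # True
    one_tuple == 1,             # False: a tuple is never equal to an int
    one_tuple[0] == just_one,   # True
]
```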
Starlark, unlike Python, does not permit a trailing comma to appear in
an unparenthesized tuple expression:
```python
for k, v, in dict.items(): pass # syntax error at 'in'
_ = [(v, k) for k, v, in dict.items()] # syntax error at 'in'
f = lambda a, b, : None # syntax error at ':'
sorted(3, 1, 4, 1,) # ok
[1, 2, 3, ] # ok
{1: 2, 3:4, } # ok
```
Any iterable sequence may be converted to a tuple by using the
built-in `tuple` function.
Like lists, tuples are indexed sequences, so they may be indexed and
sliced. The index expression `tuple[i]` returns the tuple element at
index i, and the slice expression `tuple[i:j]` returns a sub-sequence
of a tuple.
Tuples are iterable sequences, so they may be used as the operand of a
`for`-loop, a list comprehension, or various built-in functions.
Unlike lists, tuples cannot be modified.
However, the mutable elements of a tuple may be modified.
Tuples are hashable (assuming their elements are hashable),
so they may be used as keys of a dictionary.
Tuples may be concatenated using the `+` operator.
A tuple used in a Boolean context is considered true if it is
non-empty.
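The following sketch illustrates these properties: a tuple cannot be modified, but a mutable element inside it can. The function `mutate_inside` is a hypothetical name used only for this example.

```python
def mutate_inside():
    inner = [1]
    t = (inner, "x")
    t[0].append(2)      # modifies the list, not the tuple itself
    return t

mutated = mutate_inside()                 # ([1, 2], "x")
joined = (1, 2) + (3,)                    # (1, 2, 3)
lookup = {(1, 2): "a pair"}[(1, 2)]       # tuples of hashable values can be keys
```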
### Dictionaries
A dictionary is a mutable mapping from keys to values.
The [type](#type) of a dictionary is `"dict"`.
Dictionaries provide constant-time operations to insert an element, to
look up the value for a key, or to remove an element. Dictionaries
are implemented using hash tables, so keys must be hashable. Hashable
values include `None`, Booleans, numbers, strings, and tuples
composed from hashable values. Most mutable values, such as lists,
dictionaries, and sets, are not hashable, even when frozen.
Attempting to use a non-hashable value as a key in a dictionary
results in a dynamic error.
A [dictionary expression](#dictionary-expressions) specifies a
dictionary as a set of key/value pairs enclosed in braces:
```python
coins = {
    "penny": 1,
    "nickel": 5,
    "dime": 10,
    "quarter": 25,
}
```
The expression `d[k]`, where `d` is a dictionary and `k` is a key,
retrieves the value associated with the key. If the dictionary
contains no such item, the operation fails:
```python
coins["penny"] # 1
coins["dime"] # 10
coins["silver dollar"] # error: key not found
```
The number of items in a dictionary `d` is given by `len(d)`.
A key/value item may be added to a dictionary, or updated if the key
is already present, by using `d[k]` on the left side of an assignment:
```python
len(coins) # 4
coins["shilling"] = 20
len(coins) # 5, item was inserted
coins["shilling"] = 5
len(coins) # 5, existing item was updated
```
A dictionary can also be constructed using a [dictionary
comprehension](#comprehensions), which evaluates a pair of expressions,
the _key_ and the _value_, for every element of another iterable such
as a list. This example builds a mapping from each word to its length
in bytes:
```python
words = ["able", "baker", "charlie"]
{x: len(x) for x in words}  # {"able": 4, "baker": 5, "charlie": 7}
```
Dictionaries are iterable sequences, so they may be used as the
operand of a `for`-loop, a list comprehension, or various built-in
functions.
Iteration yields the dictionary's keys in the order in which they were
inserted; updating the value associated with an existing key does not
affect the iteration order.
```python
x = dict([("a", 1), ("b", 2)]) # {"a": 1, "b": 2}
x.update([("a", 3), ("c", 4)]) # {"a": 3, "b": 2, "c": 4}
```
```python
for name in coins:
    print(name, coins[name])  # prints "penny 1", "nickel 5", ...
```
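The ordering rule can be checked with a small sketch. The function `key_order` is a hypothetical name; it inserts a new key and then updates an existing one.

```python
def key_order():
    d = {"b": 1, "a": 2}
    d["c"] = 3       # a new key goes to the end of the iteration order
    d["b"] = 10      # updating an existing key leaves its position unchanged
    return [(k, d[k]) for k in d]

order = key_order()
```

Here `order` is `[("b", 10), ("a", 2), ("c", 3)]`: the keys appear in insertion order, with the updated value for `"b"`.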
Like all mutable values in Starlark, a dictionary can be frozen, and
once frozen, all subsequent operations that attempt to update it will
fail.
A dictionary used in a Boolean context is considered true if it is
non-empty.
Dictionaries may be compared for equality using `==` and `!=`. Two
dictionaries compare equal if they contain the same number of items
and each key/value item (k, v) found in one dictionary is also present
in the other. Dictionaries are not ordered; it is an error to compare
two dictionaries with `<`.
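A minimal sketch of dictionary equality, which depends on the items and not on the order in which they were inserted:

```python
same = {"a": 1, "b": 2} == {"b": 2, "a": 1}            # same items, different order
different_value = {"a": 1} == {"a": 2}                 # same key, different value
different_size = {"a": 1} != {"a": 1, "b": 2}          # different number of items
```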
A dictionary value has these methods:
* [`clear`](#dict·clear)
* [`get`](#dict·get)
* [`items`](#dict·items)
* [`keys`](#dict·keys)
* [`pop`](#dict·pop)
* [`popitem`](#dict·popitem)
* [`setdefault`](#dict·setdefault)
* [`update`](#dict·update)
* [`values`](#dict·values)
### Sets
A set is a mutable set of values.
The [type](#type) of a set is `"set"`.
Like dictionaries, sets are implemented using hash tables, so the
elements of a set must be hashable.
Sets may be compared for equality or inequality using `==` and `!=`.
Two sets compare equal if they contain the same elements.
Sets are iterable sequences, so they may be used as the operand of a
`for`-loop, a list comprehension, or various built-in functions.
Iteration yields the set's elements in the order in which they were
inserted.
The binary `|` and `&` operators compute union and intersection when
applied to sets. The right operand of the `|` operator may be any
iterable value. The binary `in` operator performs a set membership
test when its right operand is a set.
The binary `^` operator performs symmetric difference of two sets.
Sets are instantiated by calling the built-in `set` function, which
returns a set containing all the elements of its optional argument,
which must be an iterable sequence. Sets have no literal syntax.
The only method of a set is `union`, which is equivalent to the `|` operator.
A set used in a Boolean context is considered true if it is non-empty.
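The set operators described above can be sketched as follows. Both operands are sets here, since that is the case in which Starlark and Python 3 agree; in the Go implementation the `-set` flag mentioned in the note on set support is assumed to be enabled.

```python
a = set([1, 2, 3])
b = set([3, 4])

union = a | b                        # elements in either set
same_union = a.union(b) == union     # the union method matches the | operator
intersection = a & b                 # elements in both sets
symmetric = a ^ b                    # elements in exactly one set
membership = [2 in a, 4 in a]
empty_is_false = bool(set())
```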
<b>Implementation note:</b>
The Go implementation of Starlark requires the `-set` flag to
enable support for sets.
The Java implementation does not support sets.
### Functions
A function value represents a function defined in Starlark.
Its [type](#type) is `"function"`.
A function value used in a Boolean context is always considered true.
Functions defined by a [`def` statement](#function-definitions) are named;
functions defined by a [`lambda` expression](#lambda-expressions) are anonymous.
Function definitions may be nested, and an inner function may refer to a local variable of an outer function.
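As an illustration of nesting and lexical scope, the sketch below defines a hypothetical `make_adder` function whose inner function refers to the outer parameter `n`, alongside an anonymous `lambda` function.

```python
def make_adder(n):
    def add(x):
        return x + n     # n is a local variable of the enclosing function
    return add

add_five = make_adder(5)
double = lambda x: 2 * x

# A function value is always true in a Boolean context.
results = [add_five(1), double(4), bool(add_five)]
```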
A function definition defines zero or more named parameters.
Starlark has a rich mechanism for passing arguments to functions.
<!-- TODO break up this explanation into caller-side and callee-side
parts, and put the former under function calls and the latter
under function definitions. Also try to convey that the Callable
interface sees the flattened-out args and kwargs and that's what
built-ins get.
-->
The example below shows a definition and call of a function of two
required parameters, `x` and `y`.
```python
def idiv(x, y):
    return x // y

idiv(6, 3)                      # 2
```
A call may provide arguments to function parameters either by
position, as in the example above, or by name, as in the first two calls
below, or by a mixture of the two forms, as in the third call below.
All the positional arguments must precede all the named arguments.
Named arguments may improve clarity, especially in functions of
several parameters.
```python
idiv(x=6, y=3) # 2