# For developers: AST Nodes and Transformations¶

## AST Nodes¶

class Node(parent=None)

Base class for all AST nodes.

property args

Returns all arguments/children of this node.

Return type
property symbols_defined

Set of symbols which are defined by this node.

Return type
property undefined_symbols

Symbols which are used but are not defined inside this node.

Return type
subs(subs_dict)

Inplace! Substitute, similar to sympy’s but modifies the AST inplace.

Return type

None

atoms(arg_type)

Returns a set of all descendants recursively, which are an instance of the given type.

Return type
class Conditional(condition_expr, true_block, false_block=None)

Conditional that maps to a ‘if’ statement in C/C++.

Try to avoid using this node inside of loops, since currently this construction can not be vectorized. Consider using assignments with sympy.Piecewise in this case.

Parameters
subs(subs_dict)

Inplace! Substitute, similar to sympy’s but modifies the AST inplace.

property args

Returns all arguments/children of this node.

property symbols_defined

Set of symbols which are defined by this node.

property undefined_symbols

Symbols which are used but are not defined inside this node.

replace_by_true_block()

Replaces the conditional by its True block

replace_by_false_block()

Replaces the conditional by its False block

class KernelFunction(body, target, backend, compile_function, ghost_layers, function_name='kernel')
class Parameter(symbol, fields)

Function parameter.

Each undefined symbol in a KernelFunction node becomes a parameter to the function. Parameters are either symbols introduced by the user that never occur on the left hand side of an Assignment, or are related to fields/arrays passed to the function.

A parameter consists of the typed symbol (symbol property). For field related parameters this is a symbol defined in pystencils.kernelparameters. If the parameter is related to one or multiple fields, these fields are referenced in the fields property.

property target

Currently either ‘cpu’ or ‘gpu’

property backend

Backend for generating the code e.g. ‘llvm’, ‘c’, ‘cuda’

property symbols_defined

Set of symbols which are defined by this node.

property undefined_symbols

Symbols which are used but are not defined inside this node.

property args

Returns all arguments/children of this node.

property fields_accessed

fields which are accessed inside this kernel function

Type

Set of Field instances

Return type
get_parameters()

Returns list of parameters for this function.

This function is expensive, cache the result where possible!

Return type
class SkipIteration(parent=None)
property args

Returns all arguments/children of this node.

property symbols_defined

Set of symbols which are defined by this node.

property undefined_symbols

Symbols which are used but are not defined inside this node.

class Block(nodes)
property args

Returns all arguments/children of this node.

subs(subs_dict)

Inplace! Substitute, similar to sympy’s but modifies the AST inplace.

Return type

None

property symbols_defined

Set of symbols which are defined by this node.

property undefined_symbols

Symbols which are used but are not defined inside this node.

class PragmaBlock(pragma_line, nodes)
class LoopOverCoordinate(body, coordinate_to_loop_over, start, stop, step=1, is_block_loop=False)
subs(subs_dict)

Inplace! Substitute, similar to sympy’s but modifies the AST inplace.

property args

Returns all arguments/children of this node.

property symbols_defined

Set of symbols which are defined by this node.

property undefined_symbols

Symbols which are used but are not defined inside this node.

class SympyAssignment(lhs_symbol, rhs_expr, is_const=True, use_auto=False)
subs(subs_dict)

Inplace! Substitute, similar to sympy’s but modifies the AST inplace.

property args

Returns all arguments/children of this node.

property symbols_defined

Set of symbols which are defined by this node.

property undefined_symbols

Symbols which are used but are not defined inside this node.

class ResolvedFieldAccess
class TemporaryMemoryAllocation(typed_symbol, size, align_offset)

Node for temporary memory buffer allocation.

Always allocates aligned memory.

Parameters
• typed_symbol (TypedSymbol) – symbol used as pointer (has to be typed)

• size – number of elements to allocate

• align_offset – the align_offset’s element is aligned

property symbols_defined

Set of symbols which are defined by this node.

property undefined_symbols

Symbols which are used but are not defined inside this node.

property args

Returns all arguments/children of this node.

offset(byte_alignment)

Number of ELEMENTS to skip for a pointer that is aligned to byte_alignment.

class TemporaryMemoryFree(alloc_node)
property symbols_defined

Set of symbols which are defined by this node.

property undefined_symbols

Symbols which are used but are not defined inside this node.

property args

Returns all arguments/children of this node.

class SourceCodeComment(text)
property args

Returns all arguments/children of this node.

property symbols_defined

Set of symbols which are defined by this node.

property undefined_symbols

Symbols which are used but are not defined inside this node.

class EmptyLine
property args

Returns all arguments/children of this node.

property symbols_defined

Set of symbols which are defined by this node.

property undefined_symbols

Symbols which are used but are not defined inside this node.

class ConditionalFieldAccess

pystencils.Field.Access that is only executed if a certain condition is met. Can be used, for instance, for out-of-bound checks.

## Transformations¶

class NestedScopes

Symbol visibility model using nested scopes

• every accessed symbol that was not defined before, is added as a “free parameter”

• free parameters are global, i.e. they are not in scopes

• push/pop adds or removes a scope

>>> s = NestedScopes()
>>> s.access_symbol("a")
>>> s.is_defined("a")
False
>>> s.free_parameters
{'a'}
>>> s.define_symbol("b")
>>> s.is_defined("b")
True
>>> s.push()
>>> s.is_defined_locally("b")
False
>>> s.define_symbol("c")
>>> s.pop()
>>> s.is_defined("c")
False

unify_shape_symbols(body, common_shape, fields)

Replaces symbols for array sizes to ensure they are represented by the same unique symbol.

When creating a kernel with variable array sizes, all passed arrays must have the same size. This is ensured when the kernel is called. Inside the kernel this means that only on symbol has to be used instead of one for each field. For example shape_arr1[0] and shape_arr2[0] must be equal, so they should also be represented by the same symbol.

Parameters
• body – ast node, for the kernel part where substitutions is made, is modified in-place

• common_shape – shape of the field that was chosen

• fields – all fields whose shapes should be replaced by common_shape

get_common_shape(field_set)

Takes a set of pystencils Fields and returns their common spatial shape if it exists. Otherwise ValueError is raised

make_loop_over_domain(body, iteration_slice=None, ghost_layers=None, loop_order=None)

Uses pystencils.field.Field.Access to create (multiple) loops around given AST.

Parameters
• body – Block object with inner loop contents

• iteration_slice – if not None, iteration is done only over this slice of the field

• ghost_layers – a sequence of pairs for each coordinate with lower and upper nr of ghost layers if None, the number of ghost layers is determined automatically and assumed to be equal for a all dimensions

• loop_order – loop ordering from outer to inner loop (optimal ordering is same as layout)

Returns

tuple of loop-node, ghost_layer_info

create_intermediate_base_pointer(field_access, coordinates, previous_ptr)

Addressing elements in structured arrays is done with $$ptr\left[ \sum_i c_i \cdot s_i \right]$$ where $$c_i$$ is the coordinate value and $$s_i$$ the stride of a coordinate. The sum can be split up into multiple parts, such that parts of it can be pulled before loops. This function creates such an access for coordinates $$i \in \mbox{coordinates}$$. Returns a new typed symbol, where the name encodes which coordinates have been resolved.

Parameters
• field_access – instance of pystencils.field.Field.Access which provides strides and offsets

• coordinates – mapping of coordinate ids to its value, where stride*value is calculated

• previous_ptr – the pointer which is de-referenced

Returns

tuple with the new pointer symbol and the calculated offset

Examples

>>> field = Field.create_generic('myfield', spatial_dimensions=2, index_dimensions=1)
>>> x, y = sp.symbols("x y")
>>> prev_pointer = TypedSymbol("ptr", "double")
>>> create_intermediate_base_pointer(field[1,-2](5), {0: x}, prev_pointer)
(ptr_01, _stride_myfield_0*x + _stride_myfield_0)
>>> create_intermediate_base_pointer(field[1,-2](5), {0: x, 1 : y }, prev_pointer)
(ptr_01_1m2, _stride_myfield_0*x + _stride_myfield_0 + _stride_myfield_1*y - 2*_stride_myfield_1)

parse_base_pointer_info(base_pointer_specification, loop_order, spatial_dimensions, index_dimensions)

Creates base pointer specification for resolve_field_accesses() function.

Specification of how many and which intermediate pointers are created for a field access. For example [ (0), (2,3,)] creates on base pointer for coordinates 2 and 3 and writes the offset for coordinate zero directly in the field access. These specifications are defined dependent on the loop ordering. This function translates more readable version into the specification above.

Allowed specifications:
• “spatialInner<int>” spatialInner0 is the innermost loop coordinate, spatialInner1 the loop enclosing the innermost

• “spatialOuter<int>” spatialOuter0 is the outermost loop

• “index<int>”: index coordinate

• “<int>”: specifying directly the coordinate

Parameters
• base_pointer_specification – nested list with above specifications

• loop_order – list with ordering of loops from outer to inner

• spatial_dimensions – number of spatial dimensions

• index_dimensions – number of index dimensions

Returns

list of tuples that can be passed to resolve_field_accesses()

Examples

>>> parse_base_pointer_info([['spatialOuter0'], ['index0']], loop_order=[2,1,0],
...                         spatial_dimensions=3, index_dimensions=1)
[[0], [3], [1, 2]]

get_base_buffer_index(ast_node, loop_counters=None, loop_iterations=None)

Used for buffer fields to determine the linearized index of the buffer dependent on loop counter symbols.

Parameters
• ast_node – ast before any field accesses are resolved

• loop_counters – for CPU kernels: leave to default ‘None’ (can be determined from loop nodes) for GPU kernels: list of ‘loop counters’ from inner to outer loop

• loop_iterations – number of iterations of each loop from inner to outer, for CPU kernels leave to default

Returns

base buffer index - required by ‘resolve_buffer_accesses’ function

resolve_field_accesses(ast_node, read_only_field_names={}, field_to_base_pointer_info=mappingproxy({}), field_to_fixed_coordinates=mappingproxy({}))

Substitutes pystencils.field.Field.Access nodes by array indexing

Parameters
• ast_node – the AST root

• field_to_base_pointer_info – a list of tuples indicating which intermediate base pointers should be created for details see parse_base_pointer_info()

• field_to_fixed_coordinates – map of field name to a tuple of coordinate symbols. Instead of using the loop counters to index the field these symbols are used as coordinates

Returns

transformed AST

move_constants_before_loop(ast_node)

Moves pystencils.ast.SympyAssignment nodes out of loop body if they are iteration independent.

Call this after creating the loop structure with make_loop_over_domain()

split_inner_loop(ast_node, symbol_groups)

Splits inner loop into multiple loops to minimize the amount of simultaneous load/store streams

Parameters
• ast_node (Node) – AST root

• symbol_groups – sequence of symbol sequences: for each symbol sequence a new inner loop is created which updates these symbols and their dependent symbols. Symbols which are in none of the symbolGroups and which no symbol in a symbol group depends on, are not updated!

cut_loop(loop_node, cutting_points)

Cuts loop at given cutting points.

One loop is transformed into len(cuttingPoints)+1 new loops that range from old_begin to cutting_points[1], …, cutting_points[-1] to old_end

Modifies the ast in place

Returns

list of new loop nodes

simplify_conditionals(node, loop_counter_simplification=False)

Removes conditionals that are always true/false.

Parameters
• node (Node) – ast node, all descendants of this node are simplified

• loop_counter_simplification (bool) – if enabled, tries to detect if a conditional is always true/false depending on the surrounding loop. For example if the surrounding loop goes from x=0 to 10 and the condition is x < 0, it is removed. This analysis needs the integer set library (ISL) islpy, so it is not done by default.

Return type

None

cleanup_blocks(node)

Curly Brace Removal: Removes empty blocks, and replaces blocks with a single child by its child

Return type

None

class KernelConstraintsCheck(type_for_symbol, check_independence_condition, check_double_write_condition=True)

Checks if the input to create_kernel is valid.

Test the following conditions:

• SSA Form for pure symbols:
• Every pure symbol may occur only once as left-hand-side of an assignment

• Every pure symbol that is read, may not be written to later

• Independence / Parallelization condition:
• a field that is written may only be read at exact the same spatial position

(Pure symbols are symbols that are not Field.Accesses)

class FieldAndIndex(field, index)
property field

Alias for field number 0

property index

Alias for field number 1

add_types(eqs, type_for_symbol, check_independence_condition)

Traverses AST and replaces every sympy.Symbol by a pystencils.typedsymbol.TypedSymbol.

Parameters
• eqs – list of equations

• type_for_symbol – dict mapping symbol names to types. Types are strings of C types like ‘int’ or ‘double’

• check_independence_condition – check that loop iterations are independent - this has to be skipped for indexed kernels

Returns

fields_read, fields_written, typed_equations set of read fields, set of written fields,

list of equations where symbols have been replaced by typed symbols

insert_casts(node)

Checks the types and inserts casts and pointer arithmetic where necessary.

Parameters

node – the head node of the ast

Returns

modified AST

remove_conditionals_in_staggered_kernel(function_node, include_first=True)

Removes conditionals of a kernel that iterates over staggered positions by splitting the loops at last or first and last element

Return type

None

typing_from_sympy_inspection(eqs, default_type='double', default_int_type='int64')

Creates a default symbol name to type mapping. If a sympy Boolean is assigned to a symbol it is assumed to be ‘bool’ otherwise the default type, usually (‘double’)

Parameters
• eqs – list of equations

• default_type – the type for non-boolean symbols

Returns

dictionary, mapping symbol name to type

get_next_parent_of_type(node, parent_type)

Returns the next parent node of given type or None, if root is reached.

Traverses the AST nodes parents until a parent of given type was found. If no such parent is found, None is returned

parents_of_type(node, parent_type, include_current=False)

Generator for all parent nodes of given type

get_optimal_loop_ordering(fields)

Determines the optimal loop order for a given set of fields. If the fields have different memory layout or different sizes an exception is thrown.

Parameters

fields – sequence of fields

Returns

list of coordinate ids, where the first list entry should be the outermost loop

get_loop_hierarchy(ast_node)

Determines the loop structure around a given AST node, i.e. the node has to be inside the loops.

Returns

sequence of LoopOverCoordinate nodes, starting from outer loop to innermost loop

get_loop_counter_symbol_hierarchy(astNode)

Determines the loop counter symbols around a given AST node. :param astNode: the AST node :return: list of loop counter symbols, where the first list entry is the symbol of the innermost loop

replace_inner_stride_with_one(ast_node)

Replaces the stride of the innermost loop of a variable sized kernel with 1 (assumes optimal loop ordering).

Variable sized kernels can handle arbitrary field sizes and field shapes. However, the kernel is most efficient if the innermost loop accesses the fields with stride 1. The inner loop can also only be vectorized if the inner stride is 1. This transformation hard codes this inner stride to one to enable e.g. vectorization.

Warning: the assumption is not checked at runtime!

Return type

None

loop_blocking(ast_node, block_size)

Blocking of loops to enhance cache locality. Modifies the ast node in-place.

Parameters
• ast_node (KernelFunction) – kernel function node before vectorization transformation has been applied

• block_size – sequence defining block size in x, y, (z) direction

Return type

int

Returns

number of dimensions blocked