Compilers implement the `Compiler` trait, which simply specifies a single function.
For example, a subtraction can be expressed with existing operations as `add(a, mul(b, -1))`. We can have a compiler that looks for that pattern of nodes and directly replaces it with a `Subtract` operation. We’ll look at how to do this in the Writing Compilers section.
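
To make the idea of pattern replacement concrete, here is a minimal, self-contained sketch on a toy expression tree. It does not use Luminal's API: `Expr`, `ToyCompiler`, `SubtractionCompiler`, and `rewrite_example` are hypothetical names invented for illustration, and the real trait and graph types differ. The sketch only recognizes the `-1` constant on the right-hand side of the multiplication.

```rust
// Toy expression tree. None of these names come from Luminal.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Input(&'static str),
    Const(f32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Sub(Box<Expr>, Box<Expr>),
}

// Hypothetical stand-in for a compiler trait with a single function.
trait ToyCompiler {
    fn compile(&self, expr: Expr) -> Expr;
}

// Rewrites add(a, mul(b, -1)) into sub(a, b), bottom-up.
struct SubtractionCompiler;

impl ToyCompiler for SubtractionCompiler {
    fn compile(&self, expr: Expr) -> Expr {
        match expr {
            Expr::Add(a, b) => {
                // Rewrite the children first so nested patterns are caught too.
                let a = self.compile(*a);
                let b = self.compile(*b);
                match b {
                    // Pattern found: the right operand is mul(x, -1).
                    Expr::Mul(x, y) if *y == Expr::Const(-1.0) => Expr::Sub(Box::new(a), x),
                    // No match: rebuild the Add node unchanged.
                    other => Expr::Add(Box::new(a), Box::new(other)),
                }
            }
            Expr::Mul(a, b) => {
                Expr::Mul(Box::new(self.compile(*a)), Box::new(self.compile(*b)))
            }
            Expr::Sub(a, b) => {
                Expr::Sub(Box::new(self.compile(*a)), Box::new(self.compile(*b)))
            }
            // Inputs and constants have no children to rewrite.
            leaf => leaf,
        }
    }
}

// Builds add(a, mul(b, -1)) and runs the toy compiler over it.
fn rewrite_example() -> Expr {
    let pattern = Expr::Add(
        Box::new(Expr::Input("a")),
        Box::new(Expr::Mul(
            Box::new(Expr::Input("b")),
            Box::new(Expr::Const(-1.0)),
        )),
    );
    SubtractionCompiler.compile(pattern)
}
```

Calling `rewrite_example()` in this sketch yields a single `Sub` node over the inputs `a` and `b`, while expressions that do not contain the pattern come back unchanged.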
All you need to know for now is that we can apply this compiler to the graph.

Compilers that come with Luminal:
- `GenericCompiler` - A handful of hardware-agnostic optimizations, such as common subexpression elimination (CSE), to be run before any hardware-specific compilers.
- `CudaCompiler<T>` - The full stack of CUDA compilers, which converts a graph to a CUDA-specialized graph with `T` as the datatype (either `f32` or `f16`). Imported from `luminal_cuda`.
- `MetalCompiler<T>` - The Metal equivalent of `CudaCompiler<T>`. Imported from `luminal_metal`.