LLM-Attention (serial version)

Version name: LLM-Attention (serial version), a version of the LLM-Attention kernel program.
Repository: [home] and version downloads: [.zip] [.tar.gz] [.tar.bz2] [.tar]

This version is a baseline implementation of the attention phase of a Large Language Model application. It includes mathematical routines such as matrix multiplication, matrix transpose, and softmax computation.

Pre-requisites

(optional) The Extrae library, to generate Paraver traces

Several configuration files are included in the source code distribution.

Building the kernel

To build the program, run make:

$> make [ENVIRONMENT]

Where the ENVIRONMENT options can be:

WITH_EXTRAE={true | false}

The generated binary will be suffixed with the options provided above.

Executing the kernel

To run the program, pass the context size and the number of dimensions on the command line. For instance:

$> ./attention <context_size> <dim>
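
As an illustration of how the two arguments could be validated, the sketch below parses them as strictly positive integers. How the actual kernel reads its command line is not documented here, so the helper names (parse_positive, parse_args) and the rejection of zero or negative values are assumptions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a strictly positive decimal integer.
 * Returns 0 and stores the result in *value on success, -1 otherwise. */
int parse_positive(const char *text, size_t *value)
{
    char *end = NULL;
    errno = 0;
    long long v = strtoll(text, &end, 10);

    /* Reject range errors, empty input, trailing characters and values <= 0. */
    if (errno != 0 || end == text || *end != '\0' || v <= 0)
        return -1;
    if ((unsigned long long)v > SIZE_MAX)
        return -1;

    *value = (size_t)v;
    return 0;
}

/* Hypothetical helper: expects argv = { program, <context_size>, <dim> }.
 * Returns 0 on success, -1 on a wrong argument count or an invalid value. */
int parse_args(int argc, char **argv, size_t *context_size, size_t *dim)
{
    if (argc != 3)
        return -1;
    if (parse_positive(argv[1], context_size) != 0)
        return -1;
    if (parse_positive(argv[2], dim) != 0)
        return -1;
    return 0;
}
```

A main function would typically call parse_args(argc, argv, &context_size, &dim) first and print a usage message such as the command line above when it returns -1.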

The following experiments have been registered:

LLM-Attention, EPI Co-design on RISC-V platforms