llama

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-06-02 19:16:52 +08:00

Georgi Gerganov ba69bbc84c

imatrix : offload to GPU support (#4957)

* backend : add eval callback

ggml-ci

* backend : group nodes in a single compute when the user doesn't need them

* backend : clean-up the implementation

ggml-ci

* simple : do not perform tensor data copy if not needed

* simple : fix

* imatrix : offload to GPU support

* imatrix : fix ggml_mul_mat_id handling

ggml-ci

* ci : add imatrix test

ggml-ci

* ci : rearrange output

ggml-ci

2024-01-17 18:46:30 +02:00

CMakeLists.txt

Importance Matrix calculation (#4861)

2024-01-12 06:59:57 +01:00

imatrix.cpp

imatrix : offload to GPU support (#4957)

2024-01-17 18:46:30 +02:00