llama-cli Crash Analysis

Command

llama-cli -hf ggml-org/gemma-3-4b-it-GGUF -ngl 99 -fa on

Output Log

ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon Graphics, gfx1100 (0x1100), VMM: no, Wave Size: 32
common_download_file_single_online: no previous model file found /home/audstanley/.cache/llama.cpp/ggml-org_gemma-3-4b-it-GGUF_preset.ini
common_download_file_single_online: HEAD invalid http status code received: 404
no remote preset found, skipping
common_download_file_single_online: using cached file: /home/audstanley/.cache/llama.cpp/ggml-org_gemma-3-4b-it-GGUF_gemma-3-4b-it-Q4_K_M.gguf
common_download_file_single_online: using cached file: /home/audstanley/.cache/llama.cpp/ggml-org_gemma-3-4b-it-GGUF_mmproj-model-f16.gguf

Loading model...
[New LWP 5101]
[New LWP 5096]
[New LWP 5095]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
__syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
warning: 56     ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S: No such file or directory
#0  __syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
56      in ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S
#1  0x000075a9fd4a013c in __internal_syscall_cancel (a1=<optimized out>, a2=<optimized out>, a3=<optimized out>, a4=<optimized out>, a5=0, a6=0, nr=61) at ./nptl/cancellation.c:49
warning: 49     ./nptl/cancellation.c: No such file or directory
#2  __syscall_cancel (a1=<optimized out>, a2=<optimized out>, a3=<optimized out>, a4=<optimized out>, a5=a5@entry=0, a6=a6@entry=0, nr=61) at ./nptl/cancellation.c:75
75      in ./nptl/cancellation.c
#3  0x000075a9fd51c98f in __GI___wait4 (pid=<optimized out>, stat_loc=<optimized out>, options=<optimized out>, usage=<optimized out>) at ../sysdeps/unix/sysv/linux/wait4.c:30
warning: 30     ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory
#4  0x000075a9fe675a53 in ggml_print_backtrace () from /usr/local/lib/libggml-base.so.0
#5  0x000075a9fe688d4f in ggml_uncaught_exception() () from /usr/local/lib/libggml-base.so.0
#6  0x000075a9fd8c11ea in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x000075a9fd8aaa9c in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x0000579ad2b24eb1 in std::thread::~thread() ()
#9  0x000075a9fd4488a1 in __run_exit_handlers (status=130, listp=0x75a9fd634680 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at ./stdlib/exit.c:118
warning: 118     ./stdlib/exit.c: No such file or directory
#10 0x000075a9fd44897e in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:148
148     in ./stdlib/exit.c
#11 0x0000579ad29681ff in signal_handler(int) ()
#12 <signal handler called>
#13 0x000075a9bc6779ca in ?? () from /opt/rocm/lib/libhsa-runtime64.so.1
#14 0x000075a9bc67784e in ?? () from /opt/rocm/lib/libhsa-runtime64.so.1
#15 0x000075a9bc66adc1 in ?? () from /opt/rocm/lib/libhsa-runtime64.so.1
#16 0x000075a9bc65b08b in ?? () from /opt/rocm/lib/libhsa-runtime64.so.1
#17 0x000075a9bc6494d1 in ?? () from /opt/rocm/lib/libhsa-runtime64.so.1
#18 0x000075a9bc65e399 in ?? () from /opt/rocm/lib/libhsa-runtime64.so.1
#19 0x000075a9bc6c47c2 in ?? () from /opt/rocm/lib/libhsa-runtime64.so.1
#20 0x000075a9bc6c593b in ?? () from /opt/rocm/lib/libhsa-runtime64.so.1
#21 0x000075a9bc66d5db in ?? () from /opt/rocm/lib/libhsa-runtime64.so.1
#22 0x000075a9f5052f03 in ?? () from /opt/rocm/lib/libamdhip64.so.7
#23 0x000075a9f5001025 in ?? () from /opt/rocm/lib/libamdhip64.so.7
#24 0x000075a9f5035612 in ?? () from /opt/rocm/lib/libamdhip64.so.7
#25 0x000075a9f4ffdf05 in ?? () from /opt/rocm/lib/libamdhip64.so.7
#26 0x000075a9f50422f3 in ?? () from /opt/rocm/lib/libamdhip64.so.7
#27 0x000075a9f508971a in ?? () from /opt/rocm/lib/libamdhip64.so.7
#28 0x000075a9f50621b5 in ?? () from /opt/rocm/lib/libamdhip64.so.7
#29 0x000075a9f5044ce3 in ?? () from /opt/rocm/lib/libamdhip64.so.7
#30 0x000075a9f502a800 in ?? () from /opt/rocm/lib/libamdhip64.so.7
#31 0x000075a9f4f2da1c in ?? () from /opt/rocm/lib/libamdhip64.so.7
#32 0x000075a9f4f2dcba in ?? () from /opt/rocm/lib/libamdhip64.so.7
#33 0x000075a9f4f3b3fa in ?? () from /opt/rocm/lib/libamdhip64.so.7
#34 0x000075a9fcc920f0 in ggml_backend_cuda_synchronize(ggml_backend*) () from /usr/local/lib/libggml-hip.so.0
#35 0x000075a9fe68e84e in ggml_backend_sched_synchronize () from /usr/local/lib/libggml-base.so.0
#36 0x000075a9fe6913de in ggml_backend_sched_reserve_size () from /usr/local/lib/libggml-base.so.0
#37 0x000075a9fdcbafee in llama_context::graph_reserve(unsigned int, unsigned int, unsigned int, llama_memory_context_i const*, bool, unsigned long*) () from /usr/local/lib/libllama.so.0
#38 0x000075a9fdcbbbc4 in llama_context::sched_reserve() () from /usr/local/lib/libllama.so.0
#39 0x000075a9fdcbfc65 in llama_context::llama_context(llama_model const&, llama_context_params) () from /usr/local/lib/libllama.so.0
#40 0x000075a9fdcc0857 in llama_init_from_model () from /usr/local/lib/libllama.so.0
#41 0x000075a9fdc94bd1 in llama_get_device_memory_data(char const*, llama_model_params const*, llama_context_params const*, std::vector<ggml_backend_device*, std::allocator<ggml_backend_device*> >&, unsigned int&, unsigned int&, unsigned int&, ggml_log_level) () from /usr/local/lib/libllama.so.0
#42 0x000075a9fdc95d54 in llama_params_fit_impl(char const*, llama_model_params*, llama_context_params*, float*, llama_model_tensor_buft_override*, unsigned long*, unsigned int, ggml_log_level) () from /usr/local/lib/libllama.so.0
#43 0x000075a9fdc99b82 in llama_params_fit () from /usr/local/lib/libllama.so.0
#44 0x0000579ad2b19ef5 in common_init_result::common_init_result(common_params&) ()
#45 0x0000579ad2b1c8ea in common_init_from_params(common_params&) ()
#46 0x0000579ad2a3ae0d in server_context_impl::load_model(common_params const&) ()
#47 0x0000579ad2961701 in main ()
[Inferior 1 (process 5084) detached]
terminate called without an active exception
[1]    5084 IOT instruction (core dumped)  llama-cli -hf ggml-org/gemma-3-4b-it-GGUF -ngl 99 -fa on

Analysis

This points to a GPU driver/ROCm compatibility problem with llama.cpp on AMD hardware. The backtrace shows the process deep inside the ROCm HSA/HIP runtime during a device synchronize issued while llama.cpp reserves compute-graph memory at model-load time (llama_params_fit → graph_reserve → ggml_backend_sched_synchronize → ggml_backend_cuda_synchronize), not during inference. The status=130 in frame #9 corresponds to SIGINT, so the run was most likely hung in that synchronize call and then interrupted; llama.cpp's signal handler called exit(), the exit handlers destroyed a still-joinable std::thread (frame #8), and std::terminate() aborted the process with "terminate called without an active exception" (SIGABRT, which the shell reports as "IOT instruction").
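
To see which HIP call is failing or hanging before the abort, the runtime's built-in logging can help. A minimal sketch, assuming a recent ROCm release (AMD_LOG_LEVEL is a standard HIP debug variable; the log file name is arbitrary):

    # 0 = off ... 4 = most verbose; 3 prints API calls and errors to stderr
    AMD_LOG_LEVEL=3 llama-cli -hf ggml-org/gemma-3-4b-it-GGUF -ngl 99 -fa on 2> hip.log
    tail -n 50 hip.log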

Suggested Fixes:

  1. Update ROCm/Drivers:

sudo apt update && sudo apt install --only-upgrade rocm-dkms
# newer ROCm releases ship the kernel driver as amdgpu-dkms instead
    
  2. Use CPU-only Mode:

    llama-cli -hf ggml-org/gemma-3-4b-it-GGUF -ngl 0
    
  3. Reduce GPU Layers:

    llama-cli -hf ggml-org/gemma-3-4b-it-GGUF -ngl 1
    # or
    llama-cli -hf ggml-org/gemma-3-4b-it-GGUF -ngl 10
    
  4. Check ROCm Installation (a combined sanity check is sketched after this list):

    rocminfo && rocm-smi
    
  5. Verify Environment Variables:

export ROCM_PATH=/opt/rocm
# gfx1100 is an officially supported target; the override mainly matters
# if the card reports a different gfx11xx variant
export HSA_OVERRIDE_GFX_VERSION=11.0.0
    
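Steps 4 and 5 can be rolled into a single sanity check that runs before retrying llama-cli. A sketch, assuming a default /opt/rocm install (adjust paths for your setup):

    #!/usr/bin/env bash
    # Quick ROCm health check
    export ROCM_PATH=/opt/rocm

    echo "== Agents visible to the HSA runtime =="
    rocminfo | grep -iE 'marketing name|gfx'

    echo "== GPU status and VRAM =="
    rocm-smi

    echo "== Installed DKMS kernel modules =="
    dkms status 2>/dev/null || echo "dkms not installed"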

Technical Details:

The gfx1100 (RDNA3) architecture may need:

  - Newer ROCm versions for stability
  - Specific kernel (amdgpu) patches
  - Proper HSA runtime configuration
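
A quick way to confirm both sides agree on the target is to compare what the runtime reports with what the HIP build of ggml was compiled for. A heuristic sketch (the strings scan is approximate, and the library path is taken from the backtrace above):

    # what the HSA runtime sees
    rocminfo | grep -o 'gfx[0-9a-f]*' | sort -u
    # which gfx targets are baked into the ggml HIP backend (heuristic)
    strings /usr/local/lib/libggml-hip.so.0 | grep -o 'gfx[0-9a-f]*' | sort -u

If the two lists don't overlap, rebuilding llama.cpp for the right target (the HIP build docs use a flag along the lines of -DAMDGPU_TARGETS=gfx1100) is a better fix than relying on HSA_OVERRIDE_GFX_VERSION.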

The crash happens during ggml_backend_cuda_synchronize() (the HIP build of llama.cpp reuses the CUDA backend code, hence the name), which means a device-wide synchronize is failing or hanging inside the ROCm stack, most likely due to driver incompatibilities or memory-management issues rather than anything in llama.cpp itself.
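
Driver-level failures usually leave traces in the kernel log, so after a crash like this it is worth checking whether the GPU faulted, a ring timed out, or the device was reset:

    # page faults, ring timeouts, and GPU resets show up under 'amdgpu'
    sudo dmesg --ctime | grep -i amdgpu | tail -n 30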