First, make sure you have NVIDIA CUDA 12 or later installed. Do NOT install it alongside other CUDA versions: remove the old version first, then install the new one.
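To confirm which CUDA toolkit is actually active, one option is to read the release number that `nvcc --version` prints. The following is only a minimal sketch: it assumes `nvcc` is on your PATH and that its output contains a "release X.Y" line, and the helper name `parse_cuda_release` is made up for illustration.

```python
import re
import shutil
import subprocess


def parse_cuda_release(nvcc_output):
    """Extract (major, minor) from the 'release X.Y' text in nvcc output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


if __name__ == "__main__":
    # Assumption: the CUDA toolkit puts nvcc on PATH.
    if shutil.which("nvcc") is None:
        print("nvcc not found on PATH; the CUDA toolkit may not be installed.")
    else:
        output = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True
        ).stdout
        release = parse_cuda_release(output)
        if release is None:
            print("Could not read a CUDA release from the nvcc output.")
        elif release[0] >= 12:
            print("CUDA release", release, "meets the 12+ requirement.")
        else:
            print("CUDA release", release, "is older than 12.")
```

If this reports a release older than 12, an old toolkit is probably still installed or still first on PATH.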
Next, these installations require Python 3.10.* (the wheel files used here are tagged cp310, meaning they target CPython 3.10). If your base system does not use Python 3.10, you can still use Miniconda to create a Python 3.10 virtual environment and run the install commands inside it. To do that, type:
conda create --name <whatever_name_you_want> python=3.10.11 -y
conda activate <whatever_name_you_want>
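Once the environment is active, a quick check can confirm that the interpreter really is 3.10 before you install anything. This is a minimal sketch, and the helper name `is_python_310` is hypothetical:

```python
import sys


def is_python_310(version_info=sys.version_info):
    """Return True when the given version tuple is Python 3.10.x."""
    return tuple(version_info[:2]) == (3, 10)


if __name__ == "__main__":
    current = sys.version.split()[0]
    if is_python_310():
        print("Python", current, "matches the required 3.10.")
    else:
        print("Python", current, "is not 3.10; activate the conda environment first.")
```

Save it to a file and run it with `python` inside the activated environment.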
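Now, to install Flash-Attn, Triton, and Sage-Attention, run the commands below. The Flash-Attn wheel's filename indicates it was built for CUDA 12.6 and PyTorch 2.6.0 on 64-bit Windows, so your environment should ideally match.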
pip install sageattention
pip install https://huggingface.co/lldacing/flash-attention-windows-wheel/resolve/main/flash_attn-2.7.4+cu126torch2.6.0cxx11abiFALSE-cp310-cp310-win_amd64.whl
pip install https://github.com/woct0rdho/triton-windows/releases/download/v3.2.0-windows.post10/triton-3.2.0-cp310-cp310-win_amd64.whl
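After the installs finish, it can help to confirm that the packages are visible to the interpreter. The sketch below only checks whether each module can be located. It does not import them or test GPU functionality. The import names (`torch`, `triton`, `flash_attn`, `sageattention`) are assumptions based on the package names, `torch` is included only because the Flash-Attn wheel filename references PyTorch 2.6.0, and `find_missing` is a hypothetical helper name.

```python
import importlib.util

# Assumed import names; adjust them if a package exposes a different module name.
MODULES = ("torch", "triton", "flash_attn", "sageattention")


def find_missing(modules=MODULES):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All packages were found.")
```

If anything is reported missing, re-run the matching pip command inside the activated conda environment.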