Hardware architectures for on-premises training of large language models: an overview of scalable solutions
Abstract
This article examines which hardware accelerators are suitable for training large language models on your own servers rather than in the cloud. It analyzes the main accelerator types: GPUs, ASICs, FPGAs, and others. It also compares the cost of ownership of different on-premises solutions for a deployment of 256 accelerators. The conclusion is that the choice of equipment should account not only for computation speed, but also for electricity and cooling costs and for software compatibility.