Abstract: The Transformer architecture, despite following scaling laws, becomes increasingly expensive to compute as the number of parameters grows. Quantization methods like Ternary-BERT and BitNet ...
Abstract: In this paper, we propose three modular multiplication algorithms that use only IEEE 754 binary floating-point operations. Several previous studies have used floating-point operations to ...