Falcon is a standardized quantum-resistant digital signature scheme that offers advantages over other schemes but has a more complex signature generation process. This paper presents Bi-SamplerZ, a fully hardware-implemented, high-efficiency dual-path discrete Gaussian sampler designed to accelerate Falcon signature generation. Observing that the SamplerZ subroutine is consistently invoked in pairs during each signature generation, we propose a dual-datapath architecture that generates two sampling results simultaneously. To exploit coefficient correlation and the inherent properties of rejection sampling, we introduce an assistance mechanism that lets the two datapaths collaborate rather than simply duplicating the sampling process. We also incorporate several architectural optimizations over existing designs to further improve speed, area efficiency, and resource utilization. Experimental results show that Bi-SamplerZ achieves the lowest sampling latency among existing designs, benefiting from fine-grained pipeline optimization and efficient control coordination. Compared with state-of-the-art full-hardware implementations, Bi-SamplerZ reduces the sampling cycle count by 54.1\% with only a moderate increase in hardware resource consumption, thereby achieving the best-known area-time product (ATP) among fully hardware-based sampler designs. To facilitate comparison with existing works, we provide both ASIC and FPGA implementations. Together, these results highlight the suitability of Bi-SamplerZ as a high-performance sampling engine for standardized post-quantum cryptographic systems such as Falcon. Comment: 13 pages, 3 figures, submitted to TETC