Space-based gravitational wave (GW) missions are expected to detect a statistically significant population of lensed GW events, offering new opportunities for cosmology and fundamental physics. In the millihertz band, the GW wavelength can be comparable to the Schwarzschild radius of the lens, giving rise to prominent wave-optics effects. Future surveys will require rapid identification of lensed signals among large volumes of candidate events, but conventional matched filtering and Bayesian parameter estimation pipelines are often computationally expensive and thus poorly suited to fast pre-screening. In this work, we propose a lensing feature extraction model based on the extended long short-term memory (xLSTM) architecture. The model processes whitened frequency-domain data from the A and E channels of space-based detectors and leverages a matrix-memory structure to capture diffraction-induced amplitude patterns across the millihertz band. Experiments on a mixed lensing dataset show that the proposed method achieves AUC > 0.99 and improves detection capability over baseline models in the low false-positive-rate regime. Performance remains stable across representative lens models, including the point-mass lens and the singular isothermal sphere, demonstrating good cross-model generalization; the method also retains its advantage at low signal-to-noise ratios. Owing to its low inference-time cost, this approach can serve as an efficient pre-screening tool for future searches for lensed GW candidates in space-based observations.
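
As a rough illustration of why wave-optics effects matter in the millihertz band, the sketch below compares the GW wavelength with the Schwarzschild radius of the lens using the standard dimensionless frequency w = 8πGMf/c³ = 4πR_s/λ, where M is taken to be the redshifted lens mass. The function names, the chosen lens masses, and the 1 mHz reference frequency are illustrative assumptions, not values from this work; diffraction is generally expected to be significant when w is of order unity or smaller.

```python
import math

# Physical constants in SI units (rounded reference values).
G = 6.67430e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0      # speed of light, m/s
M_SUN = 1.98847e30     # solar mass, kg


def schwarzschild_radius(mass_msun):
    """Schwarzschild radius R_s = 2GM/c^2 in metres, for a mass in solar masses."""
    return 2.0 * G * mass_msun * M_SUN / C**2


def gw_wavelength(freq_hz):
    """Gravitational-wave wavelength lambda = c/f in metres."""
    return C / freq_hz


def dimensionless_frequency(mass_msun, freq_hz):
    """Dimensionless frequency w = 8*pi*G*M*f/c^3 (equivalently 4*pi*R_s/lambda).

    mass_msun is assumed to be the redshifted lens mass in solar masses.
    """
    return 8.0 * math.pi * G * mass_msun * M_SUN * freq_hz / C**3


if __name__ == "__main__":
    freq = 1e-3  # Hz; illustrative millihertz-band frequency (assumption)
    for mass in (1e5, 1e6, 1e7, 1e8):  # illustrative lens masses (assumption)
        ratio = gw_wavelength(freq) / schwarzschild_radius(mass)
        w = dimensionless_frequency(mass, freq)
        print(f"M = {mass:.0e} Msun: lambda/R_s = {ratio:.3g}, w = {w:.3g}")
```

Under these assumptions, a lens of about 10^7 solar masses at 1 mHz gives w slightly above 1, which is the regime where the wavelength and the Schwarzschild radius are comparable and geometric optics is not a safe approximation.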