Aerial person re-identification (AReID) focuses on accurately matching target person images within a UAV camera network. Challenges arise due to the broad field of view and arbitrary movement of UAVs, leading to foreground target rotation and background style variation. Existing AReID methods have provided limited solutions for the former, while the latter remains largely unexplored. This paper propose a Rotation Exploration Vision Transformer (RoExViT) to tackle the aforementioned dual challenges. Specifically, we design Multiple Rotation Tokens (MRT) to explore diverse rotational representations at the feature level, addressing foreground target rotation. To handle background style variation, we propose Cross-Camera Similarity (CCS) loss to effectively minimize the view gap among different cameras. Furthermore, we propose Iteratively Adaptive Batch Construction (IABC) strategy to mitigate overfitting on small datasets. Extensive experiments show that our method outperforms the state-of-the-art methods on PRAI-1581 and UAV-Human while also exhibting outstanding performance on Market1501.