# deep-sci.github.io

## Transformer and Attention

## State Space Models

## RoPE: Rotary Position Embedding

## Scaling Monosemanticity

## Leave No Context Behind: Infini-attention

## Diffusion Models

## ORPO: Monolithic Preference Optimization without Reference Model