Layer Parallelism: Enhancing LLM Inference Efficiency Through Parallel Execution of Transformer Layers

February 15, 2025 by theverge