vLLM in Practice: A Developer’s Guide to High-Performance Inference, Scalable Serving, and Efficient Large Language Model Deployment

Limited Time Sale

$18.60 cheaper than the new price!!

Free shipping for purchases over $99
Free cash-on-delivery fees for purchases over $99
Please note that the sales price and tax displayed may differ between online and in-store. Also, the product may be out of stock in-store.
New  $31.00

Product details

Management number 220491648
Release Date 2026/05/03
List Price $12.40
Model Number 220491648

This book provides a clear and practical introduction to working with vLLM, a modern framework designed for efficient large language model inference and serving. Written for developers, engineers, and technical practitioners, it focuses on building a strong understanding of how to deploy and optimize models in real-world environments.

Starting with the fundamentals of large language model inference, the book explains how vLLM improves throughput and memory efficiency through advanced scheduling and execution strategies. Readers will explore core concepts such as tokenization pipelines, batching techniques, and latency optimization, all presented in a structured and accessible manner.

As the material progresses, the focus shifts toward hands-on implementation. You will learn how to configure vLLM for different workloads, integrate it into existing systems, and manage performance across a variety of deployment scenarios. Practical examples illustrate how to balance resource usage with responsiveness, making it easier to build scalable AI-powered applications.

The book also addresses important operational considerations, including monitoring, debugging, and maintaining reliability in production systems. By the end, readers will have a solid foundation for using vLLM effectively, whether for experimentation, prototyping, or full-scale deployment.

This guide is intended for those who want a focused, technically grounded resource without unnecessary complexity, providing a reliable pathway into modern LLM serving workflows.

ISBN13 979-8253924297
Language English
Publisher Independently published
Dimensions 7.24 x 0.52 x 10.24 inches
Item Weight 12.2 ounces
Print length 145 pages
Publication date March 27, 2026
