19 packages found

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
  1. amd
  2. cuda
  3. deepseek
  4. gpt
  5. hpu
  6. inference
  7. inferentia
  8. llama
  9. llm
  10. llm-serving
  11. llmops
  12. mlops
  13. model-serving
  14. pytorch
  15. qwen
  16. rocm
  17. tpu
  18. trainium
  19. transformer
  20. xpu
397 Contributors
0.8.3 · published 5 days ago · Other

io.ray:streaming-api

Ray streaming API
  1. data-science
  2. deep-learning
  3. deployment
  4. distributed
  5. hyperparameter-optimization
  6. hyperparameter-search
  7. large-language-models
  8. llm
  9. llm-inference
  10. llm-serving
  11. machine-learning
  12. optimization
  13. parallel
  14. python
  15. pytorch
  16. ray
  17. reinforcement-learning
  18. rllib
  19. serving
  20. tensorflow
0.0.1 · published 3 years ago · Apache-2.0

io.ray:ray-serve

Java for Ray Serve
  1. data-science
  2. deep-learning
  3. deployment
  4. distributed
  5. hyperparameter-optimization
  6. hyperparameter-search
  7. large-language-models
  8. llm
  9. llm-inference
  10. llm-serving
  11. machine-learning
  12. optimization
  13. parallel
  14. python
  15. pytorch
  16. ray
  17. reinforcement-learning
  18. rllib
  19. serving
  20. tensorflow
2.44.0 · published 3 weeks ago · Apache-2.0
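Since io.ray:ray-serve is distributed as a Maven artifact, it would typically be pulled in via a POM dependency. A minimal declaration, assuming Maven Central hosting and the 2.44.0 version listed above:

```xml
<!-- Java Ray Serve client/runtime, matching the listed release -->
<dependency>
  <groupId>io.ray</groupId>
  <artifactId>ray-serve</artifactId>
  <version>2.44.0</version>
</dependency>
```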

vllm-npu

A high-throughput and memory-efficient inference and serving engine for LLMs
  1. amd
  2. cuda
  3. deepseek
  4. gpt
  5. hpu
  6. inference
  7. inferentia
  8. llama
  9. llm
  10. llm-serving
  11. llmops
  12. mlops
  13. model-serving
  14. pytorch
  15. qwen
  16. rocm
  17. tpu
  18. trainium
  19. transformer
  20. xpu
0.4.2.post2 · published 3 months ago · Apache-2.0

ant-ray

Ray provides a simple, universal API for building distributed applications.
  1. ray
  2. distributed
  3. parallel
  4. machine-learning
  5. hyperparameter-tuning
  6. deep-learning
  7. serving
  8. python
  9. data-science
  10. deployment
  11. hyperparameter-optimization
  12. hyperparameter-search
  13. large-language-models
  14. llm
  15. llm-inference
  16. llm-serving
  17. optimization
  18. pytorch
  19. reinforcement-learning
  20. rllib
  21. tensorflow
2.44.1.1 · published 1 week ago · Apache-2.0

vllm-online

A high-throughput and memory-efficient inference and serving engine for LLMs
  1. amd
  2. cuda
  3. deepseek
  4. gpt
  5. hpu
  6. inference
  7. inferentia
  8. llama
  9. llm
  10. llm-serving
  11. llmops
  12. mlops
  13. model-serving
  14. pytorch
  15. qwen
  16. rocm
  17. tpu
  18. trainium
  19. transformer
  20. xpu
580 Contributors
0.4.2 · published 11 months ago · Apache-2.0

secretflow-ray

Ray provides a simple, universal API for building distributed applications.
  1. ray
  2. distributed
  3. parallel
  4. machine-learning
  5. hyperparameter-tuning
  6. deep-learning
  7. serving
  8. python
  9. data-science
  10. deployment
  11. hyperparameter-optimization
  12. hyperparameter-search
  13. large-language-models
  14. llm
  15. llm-inference
  16. llm-serving
  17. optimization
  18. pytorch
  19. reinforcement-learning
  20. rllib
  21. tensorflow
662 Contributors
2.2.0 · published 2 years ago · Apache-2.0

ray

Ray provides a simple, universal API for building distributed applications.
  1. ray
  2. distributed
  3. parallel
  4. machine-learning
  5. hyperparameter-tuning
  6. deep-learning
  7. serving
  8. python
  9. data-science
  10. deployment
  11. hyperparameter-optimization
  12. hyperparameter-search
  13. large-language-models
  14. llm
  15. llm-inference
  16. llm-serving
  17. optimization
  18. pytorch
  19. reinforcement-learning
  20. rllib
  21. tensorflow
644 Contributors
2.44.1 · published 2 weeks ago · Apache-2.0

ray-cpp

A subpackage of Ray which provides the Ray C++ API.
  1. ray
  2. distributed
  3. parallel
  4. machine-learning
  5. hyperparameter-tuning
  6. deep-learning
  7. serving
  8. python
  9. data-science
  10. deployment
  11. hyperparameter-optimization
  12. hyperparameter-search
  13. large-language-models
  14. llm
  15. llm-inference
  16. llm-serving
  17. optimization
  18. pytorch
  19. reinforcement-learning
  20. rllib
  21. tensorflow
657 Contributors
2.44.1 · published 2 weeks ago · Apache-2.0

bentoml

BentoML: The easiest way to serve AI apps and models
  1. BentoML
  2. Compound AI Systems
  3. LLMOps
  4. MLOps
  5. Model Deployment
  6. Inference
  7. Serving
  8. ai-inference
  9. deep-learning
  10. generative-ai
  11. inference-platform
  12. llm
  13. llm-inference
  14. llm-serving
  15. machine-learning
  16. ml-engineering
  17. model-inference-service
  18. model-serving
  19. multimodal
  20. python
187 Contributors
1.4.8 · published 3 days ago · Apache-2.0
Showing 1 to 10 of 19 results