11 packages found

poe-api-wrapper

A simple, lightweight and efficient API wrapper for Poe.com
Keywords: python, poe, quora, chatgpt, claude, poe-api, api, chatbot, code-llama, dall-e, gemini, gpt-4, groq, llama, mistral, openai, palm2, qwen, reverse-engineering, stable-diffusion
5 Contributors
1.7.0 · published 5 months ago · GPL-3.0

node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.
Keywords: llama, llama-cpp, llama.cpp, bindings, ai, cmake, cmake-js, prebuilt-binaries, llm, gguf, metal, cuda, vulkan, grammar, embedding, rerank, reranking, json-grammar, json-schema-grammar, functions, function-calling, token-prediction, speculative-decoding, temperature, minP, topK, topP, seed, json-schema, raspberry-pi, self-hosted, local, catai, mistral, deepseek, qwen, qwq, typescript, lora, batching, gpu, nodejs
3.7.0 · published 3 weeks ago · MIT
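The "enforce a JSON schema at the generation level" feature means the sampler is constrained so the model can only emit tokens that keep the output valid. The sketch below illustrates the idea in Python with a toy prefix check; it is a conceptual illustration only, not node-llama-cpp's actual API, and `toy_model` and `ALLOWED` are invented stand-ins.

```python
# Conceptual sketch of generation-level schema enforcement (NOT node-llama-cpp's
# actual API): at each decoding step, the sampler is restricted to tokens that
# keep the partial output a valid prefix of some schema-allowed string.

def valid_prefix(partial: str, completions: set[str]) -> bool:
    """True if `partial` can still be extended to an allowed completion."""
    return any(full.startswith(partial) for full in completions)

def constrained_decode(model_ranking, completions, max_steps=20):
    """Greedy decode, masking tokens that would leave the allowed language.

    `model_ranking(partial)` stands in for the LLM: it returns candidate
    tokens in order of preference (an assumption for illustration).
    """
    out = ""
    for _ in range(max_steps):
        if out in completions:
            return out
        for tok in model_ranking(out):
            if valid_prefix(out + tok, completions):
                out += tok  # accept the best token that satisfies the grammar
                break
        else:
            raise ValueError("constraint unsatisfiable")
    return out

# Toy "schema": the output must be exactly one of these JSON strings.
ALLOWED = {'{"ok": true}', '{"ok": false}'}

def toy_model(partial):
    # Pretend the model prefers chatty tokens; the mask forces valid JSON.
    return ["Sure! ", "{", '"ok"', ": ", "true", "false", "}"]

result = constrained_decode(toy_model, ALLOWED)
```

The real library does this against a full JSON-schema-derived grammar over the model's actual token logits; the principle of masking invalid continuations per step is the same.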

vllm-npu

A high-throughput and memory-efficient inference and serving engine for LLMs
Keywords: amd, cuda, deepseek, gpt, hpu, inference, inferentia, llama, llm, llm-serving, llmops, mlops, model-serving, pytorch, qwen, rocm, tpu, trainium, transformer, xpu
0.4.2.post2 · published 3 months ago · Apache-2.0
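Serving engines in this vLLM family expose an OpenAI-compatible HTTP endpoint once a server is running. The sketch below builds and sends a chat-completion request using only the standard library; the base URL, port, and model name are illustrative assumptions, not values the package prescribes.

```python
import json
import urllib.request

def build_chat_payload(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Build an OpenAI-style /v1/chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def chat(base_url: str, model: str, prompt: str) -> str:
    """POST the payload to an OpenAI-compatible server and return the reply text."""
    body = json.dumps(build_chat_payload(model, prompt)).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]

# Usage (assumes a server is already running, e.g. on localhost:8000):
# print(chat("http://localhost:8000", "Qwen/Qwen2-7B-Instruct", "Hello"))
```

Because the several vLLM forks listed on this page share this serving model, the same client works against any of them.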

llmtuner

Easy-to-use LLM fine-tuning framework
Keywords: LLaMA, BLOOM, Falcon, LLM, ChatGPT, transformer, pytorch, deep learning, agent, ai, chatglm, fine-tuning, gpt, instruction-tuning, language-model, large-language-models, llama3, lora, mistral, moe, peft, qlora, quantization, qwen, rlhf, transformers
118 Contributors
0.7.1 · published 11 months ago · Apache-2.0
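Fine-tuning frameworks of this kind are typically driven by a declarative training config rather than hand-written training loops. The fragment below is an illustrative sketch of what a LoRA supervised fine-tuning run might look like; the field names and values are assumptions for illustration, not llmtuner's exact schema.

```yaml
### Illustrative LoRA fine-tuning config (field names are assumptions)
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
stage: sft                 # supervised fine-tuning
finetuning_type: lora      # train low-rank adapters instead of full weights
lora_rank: 8
dataset: alpaca_en
output_dir: saves/llama3-8b-lora
per_device_train_batch_size: 1
learning_rate: 1.0e-4
num_train_epochs: 3.0
```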

zh-langchain

Chinese language processing library
Keywords: chatbot, chatchat, chatglm, chatgpt, embedding, faiss, fastchat, gpt, knowledge-base, langchain, langchain-chatglm, llama, llm, milvus, ollama, qwen, rag, retrieval-augmented-generation, streamlit, xinference
45 Contributors
0.2.1 · published 2 years ago · MIT

tilearn-infer

A high-throughput and memory-efficient inference and serving engine for LLMs
Keywords: amd, cuda, deepseek, gpt, hpu, inference, inferentia, llama, llm, llm-serving, llmops, mlops, model-serving, pytorch, qwen, rocm, tpu, trainium, transformer, xpu
606 Contributors
0.3.3 · published 1 year ago · Apache-2.0

vllm-online

A high-throughput and memory-efficient inference and serving engine for LLMs
Keywords: amd, cuda, deepseek, gpt, hpu, inference, inferentia, llama, llm, llm-serving, llmops, mlops, model-serving, pytorch, qwen, rocm, tpu, trainium, transformer, xpu
580 Contributors
0.4.2 · published 1 year ago · Apache-2.0

xfastertransformer-devel

Boost large language model inference performance on CPU platforms.
Keywords: LLM, chatglm, inference, intel, llama, model-serving, qwen, transformer, xeon
24 Contributors
1.8.1.1 · published 8 months ago · Apache-2.0

xinference

Model Serving Made Easy
Keywords: artificial-intelligence, chatglm, deployment, flan-t5, gemma, ggml, glm4, inference, llama, llama3, llamacpp, llm, machine-learning, mistral, openai-api, pytorch, qwen, vllm, whisper, wizardlm
99 Contributors
1.4.1 · published 2 weeks ago · Apache-2.0

nextai-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
Keywords: amd, cuda, deepseek, gpt, hpu, inference, inferentia, llama, llm, llm-serving, llmops, mlops, model-serving, pytorch, qwen, rocm, tpu, trainium, transformer, xpu
615 Contributors
0.0.7 · published 1 year ago · Apache-2.0
Showing 1 to 10 of 11 results