@MachineLearningSystem

MachineLearningSystem

Popular repositories

  1. 25ASPLOS-Medusa Public

    Forked from thustorage/Medusa

    Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]

    HTML 7

  2. 24MLSYS-prompt-cache Public

    Forked from yale-sys/prompt-cache

    Modular and structured prompt caching for low-latency LLM inference

    Python 6

  3. 24PPOPP-Liger Public

    C++ 5

  4. Optimus-CC Public

    [ASPLOS'23] Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression

    Python 3 4

  5. ATC23-Legion Public

    Forked from JIESUN233/Legion

    RC4ML GNN System Projects

    C++ 3

Repositories

Showing 10 of 609 repositories
  • OSDI24-llumnix Public Forked from AlibabaPAI/llumnix

    Efficient and easy multi-instance LLM serving

    Python 1 Apache-2.0 27 0 0 Updated Mar 10, 2025
  • mixtera Public Forked from eth-easl/mixtera

    A lightweight, user-friendly data-plane for LLM training.

    Python 0 MIT 3 0 0 Updated Mar 3, 2025
  • 25MLSYS-NEO Public Forked from NEO-MLSys25/NEO

    NEO is an LLM inference engine built to ease the GPU memory crisis through CPU offloading

    Python 0 Apache-2.0 6 0 0 Updated Mar 2, 2025
  • FlexPrefill Public Forked from bytedance/FlexPrefill

    Code for paper: [ICLR2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference

    Python 0 Apache-2.0 2 0 0 Updated Feb 28, 2025
  • DeepEP Public Forked from deepseek-ai/DeepEP

    DeepEP: an efficient expert-parallel communication library

    Cuda 0 MIT 670 0 0 Updated Feb 25, 2025
  • 25Eurosys-NeuStream-AE Public Forked from Fjallraven-hc/NeuStream-AE

    Artifact Evaluation

    Python 0 MIT 1 0 0 Updated Feb 24, 2025
  • OSDI24-sarathi-serve Public Forked from microsoft/sarathi-serve

    A low-latency & high-throughput serving engine for LLMs

    Python 0 Apache-2.0 42 0 0 Updated Jan 31, 2025
  • 25Eurosys-JABAS Public Forked from unist-ssl/JABAS

    "JABAS: Joint Adaptive Batching and Automatic Scaling for DNN Training on Heterogeneous GPUs" (EuroSys '25)

    Python 0 MIT 1 0 0 Updated Jan 22, 2025
  • MoE-Infinity Public Forked from EfficientMoE/MoE-Infinity

    PyTorch library for cost-effective, fast and easy serving of MoE models.

    Python 0 Apache-2.0 12 0 0 Updated Jan 19, 2025
  • glinthawk Public Forked from microsoft/glinthawk

    An LLM inference engine, written in C++

    C++ 0 MIT 2 0 0 Updated Jan 17, 2025

People

This organization has no public members. You must be a member to see who’s a part of this organization.
