A curated list of awesome industry papers from major database vendors, plus other noteworthy work, for database researchers and engineers.
This repository is under construction. New PRs are welcome; please follow the entry format below:
paperName (with PDF link) [MeetingName Year] GitHub link, if the code is open source (optional)
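For contributors who want to check an entry before opening a PR, the format above can be sketched as a small parser. This is only an illustrative sketch, not a tool shipped with this repository: the `Entry` type, the `ENTRY_RE` pattern, and the `parse_entry` helper are hypothetical names. It assumes each entry is a single `- ` bullet, that the year is written with two digits meaning 20xx, and that the optional code link is one whitespace-free token at the end of the line.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    """One parsed list entry (hypothetical helper type)."""
    title: str
    venue: str
    year: int
    code_link: Optional[str]


# Assumed shape: "- <title> [<venue> <2-digit year>] <optional code link>"
ENTRY_RE = re.compile(
    r"^-\s+(?P<title>.+?)\s+"
    r"\[(?P<venue>[A-Za-z][A-Za-z ]*?)\s+(?P<year>\d{2})\]"
    r"(?:\s+(?P<code>\S+))?\s*$"
)


def parse_entry(line: str) -> Optional[Entry]:
    """Return the parsed entry, or None if the line does not match the format."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        return None
    # Assumption: two-digit years always refer to 20xx.
    return Entry(m["title"], m["venue"], 2000 + int(m["year"]), m["code"])
```

For example, `parse_entry("- Amazon Redshift Re-invented [SIGMOD 22]")` yields the title, the venue `SIGMOD`, and the year 2022, with no code link; a line without a bracketed venue and year returns `None`.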
Thanks to all authors of the papers and repositories cited here :D
- Progressive Partitioning for Parallelized Query Execution in Google’s Napa [VLDB 23]
- Keep Your Distributed Data Warehouse Consistent at a Minimal Cost [SIGMOD 23]
- Amazon Redshift and the Case for Simpler Data Warehouses [SIGMOD 15]
- Amazon Redshift Re-invented [SIGMOD 22]
- Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service [USENIX ATC 22]
- The Story of AWS Glue [VLDB 23]
- Auto-WLM: ML-enhanced workload management in Amazon Redshift [SIGMOD 23]
- Resource Management in Aurora Serverless [VLDB 24]
- Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent [VLDB 23]
- EmbedX: A Versatile, Efficient and Scalable Platform to Embed Both Graphs and High-Dimensional Sparse Data [VLDB 23]
- Towards General and Efficient Online Tuning for Spark [VLDB 23]
- TDSQL: Tencent Distributed Database System [VLDB 24]
- Eigen: End-to-end Resource Optimization for Large-Scale Databases on the Cloud [VLDB 23]
- Anser: Adaptive Information Sharing Framework of AnalyticDB [VLDB 23]
- Lindorm TSDB: A Cloud-native Time-series Database for Large-scale Monitoring Systems [VLDB 23]
- Vineyard: Optimizing Data Sharing in Data-Intensive Analytics [SIGMOD 23]
- Flux: Decoupled Auto-Scaling for Heterogeneous Query Workload in Alibaba AnalyticDB [SIGMOD 24]
- PolarDB-SCC: A Cloud-Native Database Ensuring Low Latency for Strongly Consistent Reads [VLDB 23]
- PolarDB-IMCI: A Cloud-Native HTAP Database System at Alibaba [SIGMOD 23]
- PolarDB-MP: A Multi-Primary Cloud-Native Database via Disaggregated Shared Memory [SIGMOD 24]
- Automatic SQL Error Mitigation in Oracle [VLDB 23]
- Grouping, Subsumption, and Disjunctive Join Optimizations in Oracle [VLDB 24]
- ByteHTAP: ByteDance’s HTAP System with High Data Freshness and Strong Data Consistency [VLDB 22]
- Krypton: Real-time Serving and Analytical SQL Engine at ByteDance [VLDB 23]
- VeDB: A Software and Hardware Enabled Trusted Relational Database [SIGMOD 23]
- LavaStore: ByteDance's Purpose-built, High-performance, Cost-effective Local Storage Engine for Cloud Services [VLDB 24]
- Taurus MM: bringing multi-master to the cloud [VLDB 23]
- GaussDB: A Cloud-Native Multi-Primary Database with Compute-Memory-Storage Disaggregation [VLDB 24]
- POLARIS: The Distributed SQL Engine in Azure Synapse [VLDB 20]
- Microsoft Purview: A System for Central Governance of Data [VLDB 23]
- OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance From Database Query Event Logs [VLDB 23]
- Towards Building Autonomous Data Services on Azure [SIGMOD 23]
- Presto: A Decade of SQL Analytics at Meta [SIGMOD 23]
- Disaggregating RocksDB: A Production Experience [SIGMOD 23]
- The Snowflake Elastic Data Warehouse [SIGMOD 16]
- Building An Elastic Query Engine on Disaggregated Storage [NSDI 20]
- What’s the difference? Incremental processing with change queries in Snowflake [SIGMOD 23]