Some of the largest datasets have strong time components, such as those produced by machine monitoring, real-time alerting, and IoT devices. Despite the many applications of time series data, most storage options are either highly proprietary or, worse, relational. Unlike other alternatives, TideDB does not break a data point with multiple metrics into multiple data points with one metric each, a decomposition that dramatically increases the pressure on system throughput. Instead, its data model, based on computed columns and a tag words index, can provide high write throughput, low read latency, and petabyte-scale storage. TideDB has been deployed in production on large clusters at Taide Company, where it manages multiple terabytes of storage. This paper describes how TideDB stores and organizes time series data from about one hundred thousand devices and millions of service modules.
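The contrast drawn in the abstract, between keeping a multi-metric data point whole and splitting it into one record per metric, can be illustrated with a minimal sketch. It is not TideDB's implementation or API: the class names, the `split_by_metric` and `build_tag_index` helpers, and the sample device fields are all hypothetical, and the tag index is only a guess at what a tag-based lookup structure might look like.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MultiMetricPoint:
    """Hypothetical record: one reading carrying several metrics at once."""
    timestamp: int
    tags: Dict[str, str]
    metrics: Dict[str, float]


@dataclass(frozen=True)
class SingleMetricPoint:
    """Hypothetical record for stores that hold one metric per data point."""
    timestamp: int
    tags: Dict[str, str]
    metric: str
    value: float


def split_by_metric(point: MultiMetricPoint) -> List[SingleMetricPoint]:
    """Decompose one multi-metric reading into one record per metric.

    This is the step the abstract says TideDB avoids: the timestamp and
    tags are repeated for every metric, so writes grow with metric count.
    """
    return [
        SingleMetricPoint(point.timestamp, point.tags, name, value)
        for name, value in point.metrics.items()
    ]


def build_tag_index(points: List[MultiMetricPoint]) -> Dict[Tuple[str, str], List[int]]:
    """Assumed illustration of a tag index: (tag key, tag value) -> positions."""
    index: Dict[Tuple[str, str], List[int]] = {}
    for position, point in enumerate(points):
        for key, value in point.tags.items():
            index.setdefault((key, value), []).append(position)
    return index


if __name__ == "__main__":
    reading = MultiMetricPoint(
        timestamp=1500000000,
        tags={"device": "d-001", "region": "east"},
        metrics={"cpu": 0.42, "mem": 0.73, "temp": 51.0},
    )
    # One write when the reading is stored whole, three when it is split.
    print(1, len(split_by_metric(reading)))
    print(build_tag_index([reading]))
```

With three metrics per reading, the split form issues three writes where the whole form issues one, and each of those writes repeats the same timestamp and tags.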
Published in | Internet of Things and Cloud Computing (Volume 5, Issue 3) |
DOI | 10.11648/j.iotcc.20170503.14 |
Page(s) | 59-63 |
Creative Commons |
This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright |
Copyright © The Author(s), 2017. Published by Science Publishing Group |
Keywords | Distributed, Scalability, Time Series, Internet of Things, Metric, Performance |
APA Style
Xue Yingfei. (2017). TideDB - A Distributed, Scalable Time Series Database. Internet of Things and Cloud Computing, 5(3), 59-63. https://doi.org/10.11648/j.iotcc.20170503.14
ACS Style
Xue Yingfei. TideDB - A Distributed, Scalable Time Series Database. Internet Things Cloud Comput. 2017, 5(3), 59-63. doi: 10.11648/j.iotcc.20170503.14
AMA Style
Xue Yingfei. TideDB - A Distributed, Scalable Time Series Database. Internet Things Cloud Comput. 2017;5(3):59-63. doi: 10.11648/j.iotcc.20170503.14
@article{10.11648/j.iotcc.20170503.14,
  author = {Xue Yingfei},
  title = {TideDB - A Distributed, Scalable Time Series Database},
  journal = {Internet of Things and Cloud Computing},
  volume = {5},
  number = {3},
  pages = {59-63},
  doi = {10.11648/j.iotcc.20170503.14},
  url = {https://doi.org/10.11648/j.iotcc.20170503.14},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.iotcc.20170503.14},
  year = {2017}
}
TY  - JOUR
T1  - TideDB - A Distributed, Scalable Time Series Database
AU  - Xue Yingfei
Y1  - 2017/08/07
PY  - 2017
N1  - https://doi.org/10.11648/j.iotcc.20170503.14
DO  - 10.11648/j.iotcc.20170503.14
T2  - Internet of Things and Cloud Computing
JF  - Internet of Things and Cloud Computing
JO  - Internet of Things and Cloud Computing
SP  - 59
EP  - 63
PB  - Science Publishing Group
SN  - 2376-7731
UR  - https://doi.org/10.11648/j.iotcc.20170503.14
VL  - 5
IS  - 3
ER  -