TideDB - A Distributed, Scalable Time Series Database

Xue Yingfei

doi:10.11648/j.iotcc.20170503.14

Peer-Reviewed

Published in Internet of Things and Cloud Computing (Volume 5, Issue 3)

Received: 7 August 2017 Published: 7 August 2017

Abstract

Some of the largest datasets have a strong time component, such as those produced by machine monitoring, real-time alerting, and IoT devices. Despite the many applications of time series data, most storage options are either highly proprietary or, worse, relational. Unlike other alternatives, TideDB does not break a data point carrying multiple metrics into multiple single-metric data points, a practice that dramatically increases the pressure on system throughput. Instead, its data model, based on the computed column and the tag words index, provides high write throughput, low read latency, and petabyte-scale storage. TideDB has been deployed in production on large clusters at Taide Company, where it manages multiple terabytes of storage. This paper describes how TideDB stores and organizes time series data from about one hundred thousand devices and millions of service modules.
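The throughput argument in the abstract can be illustrated with a minimal sketch. This is not TideDB's API or storage format, which the abstract does not specify. It assumes a hypothetical device reading that carries several metrics, and compares the number of writes needed when each reading is split into single-metric points with the number needed when it is kept as one multi-metric point. The names `Reading`, `split_per_metric`, and `write_count`, as well as the sample tags and metric values, are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Reading:
    """Hypothetical data point: one timestamp, shared tags, one or more metrics."""
    timestamp: int
    tags: Dict[str, str]
    metrics: Dict[str, float]


def split_per_metric(reading: Reading) -> List[Reading]:
    """Break one multi-metric reading into one single-metric reading per metric."""
    return [
        Reading(reading.timestamp, dict(reading.tags), {name: value})
        for name, value in reading.metrics.items()
    ]


def write_count(readings: List[Reading], split: bool) -> int:
    """Count the write operations needed to store the given readings."""
    if split:
        # Every metric becomes its own point, repeating the timestamp and tags.
        return sum(len(split_per_metric(r)) for r in readings)
    # Each reading is stored as a single multi-metric point.
    return len(readings)


# Illustrative sample values only; not taken from the paper.
reading = Reading(
    timestamp=1502064000,
    tags={"device": "pump-17", "site": "shanghai"},
    metrics={"temperature": 71.5, "pressure": 2.3, "rpm": 1480.0, "vibration": 0.02},
)
batch = [reading] * 1000

print(write_count(batch, split=True))   # 4000
print(write_count(batch, split=False))  # 1000
```

In this sketch the split model issues one write per metric and repeats the timestamp and tags each time, so the write volume grows with the number of metrics per reading, while the multi-metric model issues one write per reading.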

Published in	Internet of Things and Cloud Computing (Volume 5, Issue 3)
DOI	10.11648/j.iotcc.20170503.14
Page(s)	59-63
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2017. Published by Science Publishing Group

Keywords

Distributed, Scalability, Time Series, Internet of Things, Metric, Performance

Cite This Article

APA Style

Xue Yingfei. (2017). TideDB - A Distributed, Scalable Time Series Database. Internet of Things and Cloud Computing, 5(3), 59-63. https://doi.org/10.11648/j.iotcc.20170503.14

ACS Style

Xue Yingfei. TideDB - A Distributed, Scalable Time Series Database. Internet Things Cloud Comput. 2017, 5(3), 59-63. doi: 10.11648/j.iotcc.20170503.14

AMA Style

Xue Yingfei. TideDB - A Distributed, Scalable Time Series Database. Internet Things Cloud Comput. 2017;5(3):59-63. doi: 10.11648/j.iotcc.20170503.14

@article{10.11648/j.iotcc.20170503.14,
  author = {Xue Yingfei},
  title = {TideDB - A Distributed, Scalable Time Series Database},
  journal = {Internet of Things and Cloud Computing},
  volume = {5},
  number = {3},
  pages = {59-63},
  doi = {10.11648/j.iotcc.20170503.14},
  url = {https://doi.org/10.11648/j.iotcc.20170503.14},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.iotcc.20170503.14},
  year = {2017}
}

TY  - JOUR
T1  - TideDB - A Distributed, Scalable Time Series Database
AU  - Xue Yingfei
Y1  - 2017/08/07
PY  - 2017
N1  - https://doi.org/10.11648/j.iotcc.20170503.14
DO  - 10.11648/j.iotcc.20170503.14
T2  - Internet of Things and Cloud Computing
JF  - Internet of Things and Cloud Computing
JO  - Internet of Things and Cloud Computing
SP  - 59
EP  - 63
PB  - Science Publishing Group
SN  - 2376-7731
UR  - https://doi.org/10.11648/j.iotcc.20170503.14
VL  - 5
IS  - 3
ER  -

Author Information

Xue Yingfei

Research and Development Department, Tide Cloud Company, Shanghai, China
