1.1 type detection function
select toTypeName(0); -- UInt8 select toTypeName(-0); -- UInt8 select toTypeName(1000); -- UInt16 select toTypeName(-1000); -- Int16 select toTypeName(10000000); -- UInt32 select toTypeName(-10000000); -- Int32 select toTypeName(1.99); -- Float64 select toTypeName(toFloat32(1.99)); -- Float32 select toTypeName(toDate('2019-12-12')) as dateType, toTypeName(toDateTime('2019-12-12 12:12:12')) as dateTimeType; -- Date | DateTime select toTypeName([1,3,5]); -- Array(UInt8)
1.2 mathematical functions
|multiply||quadrature||select multiply(2, 2)||=4|
|divide||division||select divide(6, 2)|
select divide(10, 0)
select divide(0, 0)
|intDiv||Integer division||select intDiv(10, 3)||=3|
|intDivOrZero||Calculation quotient||select intDivOrZero(5,2)||=2|
|modulo||remainder||select modulo(10, 3)||=1|
|abs||absolute value||select abs(-10)||=10|
|gcd||greatest common divisor||select gcd(12, 24)||=12|
|lcm||Least common multiple||select lcm(12, 24)||=24|
1.3 time function
select now() as curDT,toYYYYMM(curDT),toYYYYMMDD(curDT),toYYYYMMDDhhmmss(curDT); select toDateTime('2019-12-16 14:27:30') as curDT; select toDate('2019-12-12') as curDT;
2. Use of update / delete
In terms of usage scenarios, Clickhouse is an analytical database. In this scenario, the data is generally unchanged, so Clickhouse's support for update and delete is relatively weak. In fact, it does not support standard update and delete operations.
The following describes the use of update and delete in Clickhouse.
Update and delete syntax:
Clickhouse updates and deletes through alter. It calls update and delete operations mutation. The syntax is:
ALTER TABLE [db.]table DELETE WHERE filter_expr ALTER TABLE [db.]table UPDATE column1 = expr1 [, ...] WHERE filter_expr
So, what is the difference between mutation and standard update and delete?
The update and delete operations of standard SQL are synchronous, that is, the client needs to wait for the server to return the execution result (usually the int value);
The Clickhouse's update and delete are implemented asynchronously. When the update statement is executed, the server will return immediately, but in fact, the data has not changed at this time, but wait in line.
CREATE TABLE tbl_test_users( id UInt64, email String, username String, gender UInt8, birthday Date, mobile FixedString(13), pwd String, regDT DateTime, lastLoginDT DateTime, lastLoginIP String ) ENGINE=MergeTree() partition by toYYYYMMDD(regDT) order by id settings index_granularity=8192;
Insert data into the table of the MergeTree engine
insert into tbl_test_users(id, email, username, gender, birthday, mobile, pwd, regDT, lastLoginDT, lastLoginIP) values (1,'firstname.lastname@example.org','nameless',2,'1992-05-31','13306834911','7f930f90eb6604e837db06908cc95149','2008-08-06 11:48:12','2015-05-08 10:51:41','18.104.22.168'),(2,'email@example.com','Li Si',1,'1983-10-11','15302753472','7f930f90eb6604e837db06908cc95149','2008-08-10 05:37:32','2014-07-28 23:43:04','22.214.171.124'),(3,'firstname.lastname@example.org','Wang Wu',1,'1970-11-22','15200570030','96802a851b4a7295fb09122b9aa79c18','2008-08-10 11:37:55','2014-07-22 23:45:47','126.96.36.199'),(4,'email@example.com','Zhao Liu',1,'2002-02-10','15207308903','96802a851b4a7295fb09122b9aa79c18','2008-08-10 14:47:09','2013-12-26 15:55:02','188.8.131.52'),(5,'firstname.lastname@example.org','pseudo-ginseng',1,'1987-09-01','13005861359','96802a851b4a7295fb09122b9aa79c18','2008-08-12 21:58:11','2013-12-26 15:52:33','184.108.40.206');
ALTER TABLE tbl_test_users UPDATE username='Zhang San' WHERE id=1; select * from tbl_test_users;
ALTER TABLE tbl_test_users DELETE WHERE id=1; select * from tbl_test_users;
Viewing the mutation queue
How to check whether the data has been updated?
You can view relevant information through the system.mutations table:
SELECT database, table, command, create_time, is_done FROM system.mutations ORDER BY create_time DESC LIMIT 10;
database: Library name table: Table name command: to update/Delete statement create_time: mutation Task creation time. The system processes data changes in this chronological order is_done: Is it complete? 1 is complete, 0 is incomplete
Through the above information, you can view which changes have been completed, is_ When done is 1, it means that it has been completed.
Specific process of Mutation
First, use the where condition to find the partition to be modified;
Then, rebuild each partition and replace the old partition with a new partition. Once the partition is replaced, it cannot be rolled back;
For each partition, it can be considered atomic; However, for the entire mutation, if multiple partitions are involved, it is not atomic.
matters needing attention
The update feature does not support updating columns about primary keys or partition keys
The update operation is not atomic, that is, during the update process, some of the select results are likely to change and some remain unchanged. You can know from the specific process above
Updates are performed in the order of submission
Once the update is submitted, it cannot be revoked. Even if the clickhouse service is restarted, it will continue to execute in the order of system.mutations
Entries that have been updated will not be deleted immediately, and the number of reserved entries is determined by finished_mutations_to_keep storage engine parameter determination. When the amount of data is exceeded, the old entries are deleted
Updates may get stuck. For example, if the update intvalue='abc 'type of incorrect UPDATE statement fails to execute, it will remain stuck here. At this time, you can use kill music to cancel.
kill mutation where database='app' and table='test' -- database,table yes system.mutations Fields in table
According to the official instructions, the use scenario of update/delete is to update a large amount of data at one time, that is, the result of where condition filtering should be a large amount of data.
For example: alter table test update status = 1, where status = 0 and day ='2020-04-01 ', update the data for one day at a time.
So, can you update only one piece of data at a time? For example: alter table test update pv=110 where id=100
Of course, you can, but frequent such operations may cause pressure on the service. This is easy to understand. As mentioned above, the updated unit is the partition. If only one piece of data is updated, a partition needs to be rebuilt; If 100 pieces of data are updated and they may fall on 3 partitions, 3 partitions need to be rebuilt; Relatively speaking, the overall efficiency of updating a batch of data at a time is much higher than updating a row at a time.
For the scenario of frequent single update, it is recommended to use the replacing mergetree / collapsing mergetree engine to solve it in disguise.